Tuesday, March 30, 2010

Interfaces and Productions

Grogix is Turing Complete -- these days, it seems to take special effort to create a language that isn't. A Grogix 'operational grammar' can produce the same results as the language in which it's implemented, and generally, the platform upon which it executes. After all, you can do anything in the 'context interface', and write the grammar so that it gives back to the interface the permission to "do its thing". The programmer can then gradually transfer functional structure to the grammar, from the interface.


Since the 'context interface', or interface to the outside world, makes this unusual grammar into a fully functional machine, I'd like to reveal something more about of the general structure of interfaces:


With the appropriate hooks in the traversal routine, the interface can determine a few other things about the handling of conditional productions. The notion of storage in the two presented Grogix programs, as well as the unusual meaning assigned to grammar and storage in the wiki_strings example, are just a few examples of the wide range of perfectly natural interpretations of productions in Grogix.

Some notes on this topic:

* There's no reason for the right-hand side of the production (which is sometimes called the successor) to be an ordered set or template of symbols and non-terminals. It could be unordered, generating combinations rather than permutations. This reflects certain ideas of the Minimalist Program in linguistics, for example the possibility that the "beads on a string" order (as opposed to the order of syntactical transformations) happens rather late in the production of speech (close to the sensorimotor interface). When we avoid left-to-right order in some productions, let's call them "unordered successors", we are clearly defining additional opportunities for parallel computations in the system.

* There's no reason the core generated behavior needs to be a returned string. For example, the right-side or successor side of the production could be a series of commands, executed immediately or later. The production could use any set of symbols or media that has a meaning given to it by the interface. It is up to the interface traversing the grammar to interpret the right-side expression.

* The operational grammar is not a 'grammar' in the particular sense that it is not about language syntax -- although it can be used to represent language in the standard way. Grogix looks like a grammar, and has computational properties and logical operations, as all systems labeled 'grammatical' do, and so I call it a grammar. But someone could differ with my calling this a grammar, reasonably. On the other hand, I believe Grogix seems so useful exactly because it has syntactic structure without referring to language, while at the same time being itself a programming language.

Sunday, March 28, 2010

Modularity within an Operational Grammar

In the previous example, I purposely left a fairly obvious section of "straight HTML" at the bottom of the grammar.

The detail in this section could be turned into a grammar "module", shrinking the "core" grammar, making it more comprehensible.

WikiBase.gx

. start@ :: {{ wiki }}

. wiki@url_name(page) :: {{wiki_page}}
. wiki@url_name(edit) :: {{wiki_edit_page}}
. wiki@ :: {{wiki_page}}

. wiki_page@url_name(page) :: {{ html_page( *page, wiki_content(*page) ) }}
. wiki_page@ :: {{ html_page( "WikiHome", wiki_content("WikiHome") ) }}

. wiki_edit_page@ :: {{ html_page( *edit, wiki_edit ( *edit ) ) }}

. wiki_content@ :: {{ wiki_top ($1) }} {{ wiki_text ($1) }} {{ edit_link ($1)}} {{ wiki_footer }}

. wiki_edit@ :: {{ wiki_top ($1) }} {{ wiki_form ( wiki_text ) }} {{ wiki_footer }}

. wiki_text@url_name(edit) :: {{ wiki_text.value }}
. wiki_text@ :: {{ wiki_strings (wiki_text.value) }}

. edit_link@ :: {{a_link ("/edit="+$1,"Edit "+$1)}}

. wiki_strings@pattern("\<[^<>]*\>") ::
. wiki_strings@pattern("([A-Z][a-z]+)+") :: {{ a_link("/page="+*1,*1) }}
. wiki_strings@pattern("---+") :: {{ hr }}
. wiki_strings@pattern("\*") ::    •

. wiki_top@ :: <div style="font-weight:bold;">{{ a_link( "/page="+$1, $1 ) }}</div>
. wiki_footer@ :: copyright © {{*year}}

. wiki_form@ :: <form action="/page={{*page}}" method="post">{{form_textarea(*page)}}{{form_button}}</form>

. form_textarea@ :: <textarea name="{{*page}}">{{wiki_text}}</textarea>
. form_button@ :: <input type="submit" name="button" value="save">

... html

It seems like the next nearly modular sub-grammar is the wiki_strings section that handles the retrieval of string content. Obviously this section would be much larger in, say, the WikiMedia codebase. So it makes sense to abstract it away from the core grammar.

WikiBase.gx

. start@ :: {{ wiki }}

. wiki@url_name(page) :: {{wiki_page}}
. wiki@url_name(edit) :: {{wiki_edit_page}}
. wiki@ :: {{wiki_page}}

. wiki_page@url_name(page) :: {{ html_page( *page, wiki_content(*page) ) }}
. wiki_page@ :: {{ html_page( "WikiHome", wiki_content("WikiHome") ) }}

. wiki_edit_page@ :: {{ html_page( *edit, wiki_edit ( *edit ) ) }}

. wiki_content@ :: {{ wiki_top ($1) }} {{ wiki_text ($1) }} {{ edit_link ($1)}} {{ wiki_footer }}

. wiki_edit@ :: {{ wiki_top ($1) }} {{ wiki_form ( wiki_text ) }} {{ wiki_footer }}

. wiki_text@url_name(edit) :: {{ wiki_text.value }}
. wiki_text@ :: {{ wiki_strings (wiki_text.value) }}

. edit_link@ :: {{a_link ("/edit="+$1,"Edit "+$1)}}

. wiki_top@ :: <div style="font-weight:bold;">{{ a_link( "/page="+$1, $1 ) }}</div>
. wiki_footer@ :: copyright © {{*year}}

. wiki_form@ :: <form action="/page={{*page}}" method="post">{{form_textarea(*page)}}{{form_button}}</form>

. form_textarea@ :: <textarea name="{{*page}}">{{wiki_text}}</textarea>
. form_button@ :: <input type="submit" name="button" value="save">

... wiki_strings
... html

Although it doesn't shorten the core grammar much, moving the 3 lines of the edit form to its own module makes sense, given how complex an editor can become.

WikiBase.gx

. start@ :: {{ wiki }}

. wiki@url_name(page) :: {{wiki_page}}
. wiki@url_name(edit) :: {{wiki_edit_page}}
. wiki@ :: {{wiki_page}}

. wiki_page@url_name(page) :: {{ html_page( *page, wiki_content(*page) ) }}
. wiki_page@ :: {{ html_page( "WikiHome", wiki_content("WikiHome") ) }}

. wiki_edit_page@ :: {{ html_page( *edit, wiki_edit ( *edit ) ) }}

. wiki_content@ :: {{ wiki_top ($1) }} {{ wiki_text ($1) }} {{ edit_link ($1)}} {{ wiki_footer }}

. wiki_edit@ :: {{ wiki_top ($1) }} {{ wiki_form ( wiki_text ) }} {{ wiki_footer }}

. wiki_text@url_name(edit) :: {{ wiki_text.value }}
. wiki_text@ :: {{ wiki_strings (wiki_text.value) }}

. edit_link@ :: {{a_link ("/edit="+$1,"Edit "+$1)}}

. wiki_top@ :: <div style="font-weight:bold;">{{ a_link( "/page="+$1, $1 ) }}</div>
. wiki_footer@ :: copyright © {{*year}}

... wiki_edit
... wiki_strings
... html

What we're left with is, perhaps, what we'd expect from WikiBase: the large-scale structure of a web service with two dynamic web pages.

I imagine clicking on an ellipsis (or hitting a key when it's highlighted) in order to see the sub-grammar. That sub-grammar of course will have its own module declarations etc.

Note that this makes immediate opportunities for (1) standardized definitions for interaction with modules and (2) standardized modules.

In a way, the grammar rather forced us to do this. It's slightly different from the way it forced us to separate the context from the grammar: the context interface became defined because the context wasn't part of the system derivation we focused on. That is, if you focus on the structure of a system, a context will always emerge that will make it easier to define the system grammatically. Modules are slightly different. In the process of deriving the system grammar, we notice sub-grammars that are quite modularized, even though they may be used pervasively by the grammar. We keep them in the grammar, but we put them aside. This is different from the context interface, which would be difficult to include in the grammar. It's really a matter of where you start. If we wrote a different grammar, with a focus on building a web server, then the context interface would be different -- it would be the operating system. A wiki, or any web service, would be something so far down from a web server grammar, that it would be running inside a standardized module (the application container).

I think it's a strength of this approach that, no matter where you start, it seems like you'll derive a rather similar layering of operational grammars. This is probably because we're focusing so intently on the structure of the functionality: there is no reason to believe that the solutions would be very different.

Thursday, March 25, 2010

Decimating the WikiBase

The Wiki was invented by a talented programmer who wanted to let his friends store patterns of software design modeled after Christopher Alexander's work in A Pattern Language.

The original Perl code is known as WikiBase, and after removing the HyperPerl structure, inspired by Donald Knuth's Literate Programming ideas, the code runs to about:

300 lines

Below is a functionally equivalent Grogix program, which I wrote in about an hour, and which could be better, but runs to about:

30 lines

The difference is Grogix, not me. Operational Grammars compact the multiplicative effects in a system, and clarify system structures and their appropriate order of emergence. Real-world context is pushed into the interface, and all that remains is the unfolding structure.

When looking at this example, please remember that an Operational Grammar represents a full executing system, not just the syntax of a language.

In this example, I've made some changes from previous posts.

1) The ellipses are gone, because people inferred that something was missing, which was not the case.

2) I've marked the three parts of a Conditional Production with '@' and '::' :

nonterminal@condition :: template

3) I've removed the keyword for the 'default' condition. So a default condition looks like:

nonterminal@ :: text

And of course the default condition goes at the end of any run of productions for the same nonterminal. They are evaluated in order.

4) I also changed the right-hand side's {{nonterminal::parameter1,parameter2}} notation to {{ nonterminal( parameter1,parameter2 ) }}. These compose, so: {{ nonterminal1( parameter1, nonterminal2( parameter1, parameter2 ) ) }} ... note the productions access parameters with $1, $2 etc ... these will be name-able in the future.

5) As before, every nonterminal is a potential persistent data type, where its access by .value triggers the persistence mechanism in the interface.

The interface is somewhat different than in the previous example, allowing usages like the patterns-parsing of "wiki_strings" -- that's the strength of the interface-separation approach. New kinds of interface functionality can be added, but it's still a grammar, representing the structure of execution of a system.


WikiBase.gx

. start@ :: {{ wiki }}

. wiki@url_name(page) :: {{wiki_page}}
. wiki@url_name(edit) :: {{wiki_edit_page}}
. wiki@ :: {{wiki_page}}

. wiki_page@url_name(page) :: {{ html_page( *page, wiki_content(*page) ) }}
. wiki_page@ :: {{ html_page( "WikiHome", wiki_content("WikiHome") ) }}

. wiki_edit_page@ :: {{ html_page( *edit, wiki_edit ( *edit ) ) }}

. wiki_content@ :: {{ wiki_top ($1) }} {{ wiki_text ($1) }} {{ edit_link ($1)}} {{ wiki_footer }}

. wiki_edit@ :: {{ wiki_top ($1) }} {{ wiki_form ( wiki_text ) }} {{ wiki_footer }}

. wiki_text@url_name(edit) :: {{ wiki_text.value }}
. wiki_text@ :: {{ wiki_strings (wiki_text.value) }}

. edit_link@ :: {{a_link ("/edit="+$1,"Edit "+$1)}}

. wiki_strings@pattern("\<[^<>]*\>") ::
. wiki_strings@pattern("([A-Z][a-z]+)+") :: {{ a_link("/page="+*1,*1) }}
. wiki_strings@pattern("---+") :: {{ hr }}
. wiki_strings@pattern("\*") ::    •

. wiki_top@ :: <div style="font-weight:bold;">{{ a_link( "/page="+$1, $1 ) }}</div>
. wiki_footer@ :: copyright © {{*year}}

. wiki_form@ :: <form action="/page={{*page}}" method="post">{{form_textarea(*page)}}{{form_button}}</form>

. form_textarea@ :: <textarea name="{{*page}}">{{wiki_text}}</textarea>
. form_button@ :: <input type="submit" name="button" value="save">

. a_link@ :: <a href="{{$1}}">{{$2}}</a>

. hr@ :: <hr \>
. br@ :: <br \>

. html_page@ :: {{ html_header( $1 ) }} {{ html_body( $2 ) }}
. html_header@ :: {{doctype}} <head><title>{{ $1 }}</title>{{html_meta}}</head>
. html_body@ :: <body>{{ $1 }}</body>
. html_meta@ :: <meta http-equiv="content-type" content="text/html; charset=utf-8"/>
. doctype@ :: <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">


There are tools I'll build that should make this kind of programming even easier. I imagine an Eclipse plug-in that has two easily-switchable navigation modes: (1) scrolls around the whole grammar and (2) shows only the parent and child productions of whatever production your cursor is on. I'm guessing that this mode, with state discrimination added, could help expand the overlaid trees of conditional grammars, assisting our comprehension of even more complex systems, and producing a kind of "programming by borrowing" of grammar sub-trees ... anyway, much of the problem of navigating complex systems is mitigated by the one-grammar-per-system idea. It's a great model for portability, with a built-in idea for well-defined separation and integration between engine and interface. Additional systems (especially the interfaces) will have their own grammars too ... each layer in a real-world "interface stack" can be a system with its own operational grammar.

Sunday, March 21, 2010

The Conditional Production

The primary notion in Grogix is the "extraction" of structure, or essence, or form, "from" the system being designed, and the "concentration" of that structure "into" the operational grammar, which executes the system.

The basic construct that makes this possible is The Conditional Production. The typical grammar used for parsing will pair patterns with actions, which can be transformations. In contrast, the Conditional Production in the grammar only defines the non-terminal if the condition is met at run-time. This, along with the recursion found in any grammar, allows the definition of a single, finite grammar with infinite numbers of state combinations. Any programming language with iteration or recursion can do this, but only Grogix puts it into a single grammatical structure.

Grogix's "hello world" example:

. start ... default ... hello world

And a more complex version, making use of Conditional Productions:

. start ... default ... hello {{name}}
. name ... *user=="greg" ... mister G
. name ... default ... {{*user}}

"default" is the default or base condition. When a conditional expression that evaluates true is reached, moving from the top to the bottom of the list of productions, that production becomes active.

In these examples, the far right, past the second ellipsis, is a text template, marked or escaped by these {{ call outs }} to further non-terminals or variable values.

The * construction represents an interface variable.

The grammar runs within an interface, and the grammar has various ways of receiving values from the interface. The interface here is not defined, making the grammar a kind of operational idealization, which can be used with multiple interfaces. One interface that we can imagine for these examples is a terminal program which has the user's login name available to it, which it then provides to the grammar. The grammar exits when the tree is fully traversed. We could imagine the interface then provides this result to the user, and quits. We could also imagine the exact same grammar for a web-based terminal service. A grammar can be written to remain abstract from such considerations, making it ideal for portability, or insertion into legacy situations. It could be also contain layers of sets of productions specifically written to deal with the outside world beyond more transparent interfaces, and these layers might be swapped out for different outside contexts.

None of this would make much sense without the key idea of a central grammar executing the system. With Grogix, and Blooming Logic, one has to think of grammars differently, as the central organizing notation for computation, rather than as a tool for defining language syntax.

Monday, March 15, 2010

Depth and Innate Judgements

This post was moved to the Chomsky & Alexander blog, along with future posts about Grogix that are strictly philosophical in nature.

Tuesday, March 9, 2010

Grammar-centric programming

Let's look at a small Grogix example. Grogix stands for "GRammar-Operated General information computing system"

Again, the notion is to keep the entire system coherent through a single, central grammar.

Around the grammar is an interface to the world. The interface allows the recursive traversal procedure, the one that interprets the grammar, to drive action outside, and receive state-change information from the outside.

This particular interface is a Python program, acting as both a Grogix interpreter, and as a liaison with the Google App Engine framework.

Here's a Grogix grammar for a simple wiki:

. start ... start ... {{default_page}}
. default_page ... default ... {{page_contents}}

. page_contents ... name_value(page) ... {{content_section}}
. page_contents ... name_value(site) ... {{page_section}}
. page_contents ... default ... {{site_section}}

. site_section ... no_item(site) ... {{site_cycle}}
. site_section ... default ... List of Sites: {{sites}}{{br}}{{site_cycle}}{{br}}{{br}}{{br}}{{hr}}{{br}}

. sites ... default ... {{sites:site}} {{br}} <a href="/site={{site.name}}"> {{site.name}}</a> : {{site.value}}
. site_cycle ... post ... {{site_cycle_post_response}}
. site_cycle_post_response ... name_value(add) ... success
. site_cycle ... get ... {{site_cycle_get_response}}
. site_cycle_get_response ... name_value(action) ... {{standard_form}}
. site_cycle_get_response ... default ... <a href="/action=add&item=site">Add a site</a>

. page_section ... no_item(page) ... Site: {{*site}}{{br}}{{page_cycle}}
. page_section ... default ... <a href="/">Back to site overview</a>{{br}}Pages:{{pages}}{{br}}{{page_cycle}}{{br}}{{br}}{{br}}{{hr}}{{br}}

. pages ... default ... {{pages:page}} {{br}} -> <a href="/site={{*site}}&page={{page.name}}">{{page.name}}</a> : {{page.value}}
. page_cycle ... post ... {{page_cycle_post_response}}
. page_cycle_post_response ... name_value(add) ... success
. page_cycle ... get ... {{page_cycle_get_response}}
. page_cycle_get_response ... name_value(action) ... {{standard_form}}
. page_cycle_get_response ... default ... <a href="/site={{*site}}&action=add&item=page">Add a page</a>

. content_section ... no_item(content) ... Page: {{*page}}{{br}}{{content_cycle}}'
. content_section ... default ... <a href="/">Back to site overview</a>{{br}}Content:{{contents}}{{br}}{{content_cycle}}{{br}}{{br}}{{br}}{{hr}}{{br}}

. contents ... default ... {{contents:content}} {{br}} {{content.name}} {{br}} <b>{{content.value}}</b>
. content_cycle ... post ... {{content_cycle_post_response}}
. content_cycle_post_response ... name_value(add) ... success
. content_cycle ... get ... {{content_cycle_get_response}}
. content_cycle_get_response ... name_value(action) ... {{standard_form}}
. content_cycle_get_response ... default ... <a href="/site={{*site}}&page={{*page}}&action=add&item=content">Add content</a>

. standard_form ... default ... <form action="{{form_target}}" method="post"> {{form_name_textarea}}{{form_value_textarea}}{{standard_form_button}}{{form_hiddens}}</form>

. form_name_textarea ... default ... name: <textarea style="height:30px;width:300px;background:#eeee00;" name="name"></textarea>{{br}}

. form_value_textarea ... default ... value: <textarea style="height:30px;width:300px;background:#eeee00;" name="value"></textarea>{{br}}

. standard_form_button ... default ... <span align="left"><input type="Submit" name="button" value="{{standard_form_button_text}}"></span>

. form_target ... add(site) ... /action=add&item=site
. form_target ... add(page) ... /site={{*site}}&action=add&item=page
. form_target ... add(content) ... /site={{*site}}&page={{*page}}&action=add&item=content

. form_hiddens ... default ... {{fh::"landing_site",*site}} {{fh::"next_start","start"}} {{fh::"landing_page",*page}}

. fh ... default ... <input type="hidden" name="{{$1}}" value="{{$2}}">

. standard_form_button_text ... add(site) ... Add Site
. standard_form_button_text ... add(page) ... Add Page
. standard_form_button_text ... add(content) ... Add Content

. br ... default ... <br />

. hr ... default ... <hr />



So, what are you looking at here?

It's a grammar. You can see the "start" non-terminal at the beginning.

It's an ordered set of non-terminal productions. You can see the non-terminal on the left is defined by the phrase on the right ... and the phrase on the right is clearly a text-based template ... the production is 'guarded' by a conditional clause between the non-terminal and the template-like phrase:

. [non-terminal] ... [conditional] ... [defining phrase]

The reason I precede each production with a "dot", is that I like to have free comments, to explain what is going on. This is a nod to Donald Knuth's important observations regarding Literate Programming.

The ordered conditionals determine which productions are followed at run-time.

The defining production itself has further non-terminal references in {{ braces }}.

The productions can also inherit attributes passed from (or bequeathed by) its parent production, which uses this construction: {{ nt :: 'literal', *interface_value, $[inherited_value] }}.

The productions can use information directly from the interface, and can modify interface values, including where the interface should "start" in the grammar on various occasions.

There's an assumed general data-type which associates with a non-terminal, and can be assigned values. It is wound and unraveled with the {{sites:site}} recursive construction, with constraints selected in this way {{contents : content [site,page] }}, where site and page are in this case constraints on the data by those values, passed from the interface.

Note that this was my first working version ... the program itself can obviously be shortened considerably to achieve the same end (there's a repetition I was using while debugging some properties of the interpreter ... discovering this is perhaps a useful exercise for the reader). In any case, in my next installment, I'll show a better grogix grammar for a wiki closer in definition to the original, and deploy the grammar-driven Google App Engine online, so people can try it for themselves.

Monday, March 8, 2010

Grogix: the GRammar-Operated General information computing system

I'm motivated by the vast chasm between the products of human engineering and those of the biological world.

I mentioned this here. I'm using the techniques in Christopher Alexander's The Nature of Order, that is, using feeling as a test procedure. The test procedure uncovers what has strong biological coherence, and what doesn't. It is analogous to Noam Chomsky's linguistic test procedure, to determine what is, and what is not, part of the human faculty of language. The Alexandrian version of the Chomsky approach, has helped me to find a strong notation, one that can actually give us a comfortable, natural, humane encoding for systems with the mind-boggling complexity of natural organisms.

At the center of Alexander's current approach is the notion of coherence. Some fields of centers ('centers' are things with a field-like property, without necessarily having a distinct boundary) work together, and support each other, holistically, with different degrees of coherence.

I considered what it was that best held everything together coherently, at least for me, in a computing system. And I know it sounds trite, but it seems that I always come back to the idea that a grammar defines the system -- it's kin to Aristotle's loosely-hierarchical formal cause, the essence of a thing, which still works to describe structure everywhere in the universe.

Now, even though programming systems always seem to have some holistic grammatical properties, for some reason we never express them grammatically. We express them as categories, functions, sets, classes, statements etc. But the only system that actually looks and feels like the grammar of language, is the attribute-grammar toolset of YACC, and its derivatives.

For decades, I've said to myself: "why do we only use grammars at the input and the output of a system?" We know from various kinds of re-writing systems (L-systems and Markov algorithms) that grammars are very powerful computationally. As a consequence of Chomsky's linguistics work, it's clear that various kinds of deep grammar must be at the core of our natural enumerative capacity. So, why are we using functional notation, or trivial imperative notation, when grammar is actually the heart of the form of a system?

I also take to heart Feyerabend's prescription for progress: we've been incrementally building a body of doctrine about programming language semantics, in part to maintain logical consistency in the face of engineering demands -- yet we seem no closer to bridging the chasm between ourselves and nature. Maybe we need to throw out our carefully-constructed mathematical deductions. Maybe we need to dig more deeply, not just into what nature is doing, not just into what mathematics calls for, but into what we do best, mentally, as natural organisms. Our mathematical and symbolic ability is a consequence of our faculty of language. What could be more natural than putting language at the center?

So grogix: a grammar-operated general information computing system. The grammar is the heart of the system. You can have other systems. But if you want a coherent system, it must have one grammar, explicitly, facilitated and protected, at its heart.

Grogix 0.2

Since October of 2009, I was forced to take grogix in a new direction [I now call this old direction, interesting in itself, SUPL]. The old direction was, essentially, a macro preprocessor for generating programs ... albeit one with a well-integrated notion of grammar, morphogenetic sequences, pattern-like steps in those sequences, etc. It's still a useful authoring tool for the kind of analysis described in the sample sequence at www.corememory.org.

But I began to realize: no matter how natural or well-motivated or well-described, a hierarchical expression of the form of a generated Python program was not going to make me happy. I'm just too unsatisfied with our current programming paradigm, which always puts logical consistency before all else, despite the fact that logical consistency has become so straightforward and commonplace that young programmers can create turing complete notations without even knowing that they are doing so.

So, one-by-one, I took parts of grogix-the-macro-processor which I felt were natural and well-motivated, and integrated them into grogix-the-programming language.