% The grammar-combinators Parser Library
% Dominique Devriese
% 2010

<div id="body">
# News
* 17-4-2011: Grammar-combinators v0.2.5 is released. Fixes incompatibility
  with UUParse 2.7.

* 17-2-2011: Grammar-combinators v0.2.4 is released. Home page back to
  projects.haskell.org.

* 10-2-2011: Grammar-combinators v0.2.3 is released. Changed home page
  because of unavailability of projects.haskell.org. Small fix for
  backward compatibility with GHC 6.

* 31-1-2011: Grammar-combinators v0.2.2 is released. This release
  makes the library work with GHC 7 (thanks to Doaitse Swierstra for
  reporting the problem).

* 5-1-2011: Grammar-combinators v0.2.1 was released. This release
  contains a new [grammar combination primitive], the early start of a
  [reusable grammar library], some experiments with bias introduction
  for Parsec and penalty-based error handling, more work on TH lifting
  of grammars etc. There is also an important bugfix in the uniform
  Paull transformation (thanks to Andries Verreydt).

[grammar combination primitive]: http://hackage.haskell.org/packages/archive/grammar-combinators/0.2.1/doc/html/Text-GrammarCombinators-Transform-CombineGrammars.html
[reusable grammar library]: http://hackage.haskell.org/packages/archive/grammar-combinators/0.2.1/doc/html/Text-GrammarCombinators-Library-Numeric.html

# Introduction

The **grammar-combinators** library is an experimental next-generation
parser library written in Haskell ([LGPL] license). The library
features much of the power of a parser generator like Happy or ANTLR,
but with the library approach and most of the benefits of a parser
combinator library. Read the [Tutorial] to learn how the library
works.

The library is currently not intended for mainstream use. Its API is
relatively stable, but performance needs to be looked at further (see [below](#performance)).

[LGPL]: http://www.gnu.org/licenses/lgpl.html

# Features

* **Grammars defined in Haskell** using an elegant syntax
* **Functional style grammar algorithms** (no fresh non-terminal
  identifier etc.), with elegant and meaningful types.
* **Multi-backend**: use the same grammar with a [Packrat], [Parsec]
  or [UUParse] parser
* **Grammar transformations**: contains a powerful grammar
  transformation library, with the [left-corner left-recursion removal
  transform], a uniform version of the classic [Paull left-recursion
  removal][Paull transformation], and various smaller transformations
  (dead-branch removal, dead non-terminal removal, consecutive epsilon
  combination, selective unfolding etc.)
* **Grammar utility functions**: printing of grammars, [FIRST-set]
  calculation, reachability analysis of non-terminals, etc.
* **Compile-time transformations** (using [Template Haskell]), given a
suitable definition of the grammar. This is currently limited to a
certain set of transformations.

[UUParse]: http://www.cs.uu.nl/wiki/bin/view/HUT/ParserCombinators
[Parsec]: http://www.haskell.org/haskellwiki/Parsec
[Packrat]: http://en.wikipedia.org/wiki/Parsing_expression_grammar#Implementing_parsers_from_parsing_expression_grammars 
[FIRST-set]: http://en.wikipedia.org/wiki/LL_parser#Constructing_an_LL.281.29_parsing_table
[left-corner left-recursion removal transform]: http://portal.acm.org/citation.cfm?id=974338&dl=GUIDE,
[Paull transformation]: http://en.wikipedia.org/wiki/Left_recursion#Removing_left_recursion
[Template Haskell]: http://www.haskell.org/haskellwiki/Template_Haskell

# Download

* Source code
    * [sdist] on hackage
    * Clone the darcs repository: 
          `darcs get http://code.haskell.org/grammar-combinators/`
* Install:
  `cabal update && cabal install grammar-combinators`
* [Hackage info page]: also contains Haddock docs
* [Tutorial]

[Hackage info page]: http://hackage.haskell.org/package/grammar-combinators
[sdist]: http://hackage.haskell.org/packages/archive/grammar-combinators/0.1/grammar-combinators-0.1.tar.gz
[Tutorial]: tutorial.html

# Tutorial
 
The best way to learn more about this library is to read the
[Tutorial]. After that, install the library and play with the code.

# Limitations
* Compositionality of the grammars is currently limited, but we have some ideas about a grammar combination facility 
* Performance is still poor at the moment, but see below for some notes about optimization experiments

# Ideas for future work

We currently have several ideas for future work. People interested are
very welcome to contact us.

* Heuristic-based modular error recovery

* Penalty-based ambiguous parsing for highly interactive scenario's.

* Automatic grammar transformation for use with penalty based ambiguous parsing.

* Compositionality of grammars: add a facility for combining grammars.

* Using compositional grammars: define a library of reusable non-terminals.

# Performance

The library's performance can currently be rather slow (parse time x50
for the grammar from our tutorial above compared against a manually
transformed UUParse parser). However, we have done some initial
experiments with the GHC compile flags [proposed by Magalhaes,
Holdermans and Jeuring][optg] where the factor of 50 was magically
reduced to about 3. With the grammar transformation additionally
performed at compile-time using Template Haskell, this factor was
further reduced to slightly over 2.

Maybe further optimization is possible using more fine-grained
inlining flags?

[optg]: http://portal.acm.org/citation.cfm?id=1706366

# Background

The grammar-combinators library is fundamentally based on an abstract
representation of context-free grammars with explicit recursion. The
ideas behind this library are described in the following paper
(being presented at [PADL 2011]). An accompanying technical report discusses
the implementation of some important grammar algorithms.

* Paper:
    * [Context-Free Grammar Combinators - A Better Model for Shallow Parser DSLs][Paper]: being presented at [PADL 2011]
* Technical report
    * [Context-Free Grammar Combinators - The Implementation of some Grammar Algorithms - Technical Report][TR]

[PADL 2011]: http://www.dcc.fc.up.pt/PADL-2011/
[Paper]: http://people.cs.kuleuven.be/~dominique.devriese/permanent/cfgc-final-PADL.pdf
[TR]: http://www.cs.kuleuven.be/publicaties/rapporten/cw/CW594.abs.html

# Links

* [Multirec]: a Haskell generic programming library which is used throughout the grammar-combinators library.
* [UUParse]: Applicative, error correcting parser combinator library, upon which part of our API is based.

[Multirec]: http://www.cs.uu.nl/wiki/GenericProgramming/Multirec
[UUParse]: http://www.cs.uu.nl/wiki/Center/UtrechtParserCombinators

# Authors

* Dominique Devriese &lt;dominique.devriese &lt;at&gt; cs &lt;dot&gt; kuleuven &lt;dot&gt; be&gt;


 </div>
