Pages: 328 Published: January 2013 ISBN: 9781934356999
In Print
The Definitive ANTLR 4 Reference
by Terence Parr
Programmers run into parsing problems all the time. Whether it’s a data
format like JSON, a network protocol like SMTP, a server configuration
file for Apache, a PostScript/PDF file, or a simple spreadsheet macro
language—ANTLR v4 and this book will demystify the process. ANTLR v4
has been rewritten from scratch to make it easier than ever to build
parsers and the language applications built on top of them. This
completely rewritten edition of the bestselling Definitive ANTLR
Reference shows you how to take advantage of the new features.
“Parr’s clear writing and lighthearted style make it a pleasure to
learn the practical details of building language processors.”
Dan Bornstein - Designer of the Dalvik VM for Android
“ANTLR is an exceptionally powerful and flexible tool for parsing
formal languages. At Twitter, we use it exclusively for query parsing in
our search engine. Our grammars are clean and concise and the generated
code is efficient and stable. This book is our go-to reference for ANTLR
v4—engaging writing, clear descriptions and practical examples all in
one place.”
Samuel Luckenbill - Senior Manager of Search Infrastructure - Twitter,
Inc.
Build your own languages with ANTLR v4, using ANTLR’s new advanced
parsing technology. In this book, you’ll learn how ANTLR automatically
builds a data structure representing the input (parse tree) and
generates code that can walk the tree (visitor). You can use that
combination to implement data readers, language interpreters, and
translators.
You’ll start by learning how to identify grammar patterns in language
reference manuals and then slowly start building increasingly complex
grammars. Next, you’ll build applications based upon those grammars by
walking the automatically generated parse trees. Then you’ll tackle some
nasty language problems by parsing files containing more than one
language (such as XML, Java, and Javadoc). You’ll also see how to take
absolute control over parsing by embedding Java actions into the
grammar.
You’ll learn directly from well-known parsing expert Terence Parr, the
ANTLR creator and project lead. You’ll master ANTLR grammar construction
and learn how to build language tools using the built-in parse tree
visitor mechanism. The book teaches using real-world examples and shows
you how to use ANTLR to build such things as a data file reader, a
JSON-to-XML translator, an R parser, and a Java class-to-interface extractor.
This book is your ticket to becoming a parsing guru!
You can find out more about ANTLR 4 in this interview with Terence Parr.
What You Need:
ANTLR 4.0 or later. Java development tools. The Ant build system is
optional (needed for building ANTLR from source).
This release fixes a number of important code errors and a number of little typos. The two-stage parsing discussion now conforms to current best practices. The Cymbol.g4 grammar now allows only identifiers as array names. The JSON grammar did not allow floating-point numbers whose fractional part started with a 0. A code example referenced listeners/XML.stg, which was missing. There were some stale URL references. Some of the images early in the book had gray boxes that were shifted out of place. In the section on tree listeners, the code sample for PropertyFileBaseVisitor was wrong. I added a note to explicitly say that combined grammars can import other combined grammars.
Check out the book’s [errata](https://pragprog.com/titles/tpantlr2/errata) page for more details.
2013/01/16
P1.0
First printing.
2012/12/13
B4.0
Indexing and copy edit are complete. Now it’s on to layout and then the printer.
2012/11/12
B3.0
The book is now final draft complete and heading to production. I have gone through all of the technical reviews and made widespread changes. In response to concerns about the pace of the initial material, I have tried to add more details and also split out the starter project into its own chapter. This leaves the big picture chapter smaller and easier to chew on.
I went through the entire book, verifying that the examples compile, perform correctly, and match the output in the book.
There were lots and lots of little things in the errata that got fixed in this version.
I have added a number of improvements to the discussion of wildcards and nongreedy loops.
ANTLR v4.0b3 coincides with this beta 3 book.
Terence Parr is a professor of computer science and graduate program
director at the University of San Francisco. He is the creator of the
ANTLR parser generator and StringTemplate engine, and also has broad
industrial experience related to language implementation. Terence
holds a Ph.D. in Computer Engineering from Purdue University and was a
postdoctoral fellow at the Army High-Performance Computing Research
Center at the University of Minnesota.