The primary construct in Grit is a pattern, which searches in a codebase for matching clauses and optionally executes a specified transformation. Patterns are also commonly referenced in side conditions, which Grit uses to execute rewrites. Patterns come in several forms.
Code snippets
The most basic form of a pattern is just a snippet in the target language surrounded by backticks (`). For example:
`console.log('Hello, world!')`
console.log('Hello, world!')
To match on variations of a snippet, code snippets can include metavariables:
`console.log($message)`
Alternative syntax
Code snippets can alternatively be enclosed by double quotes prefixed with a language annotation. Language annotations correspond to one of Grit's supported target languages and limit the snippet to matching only code in that language.
The language annotation for JavaScript/TypeScript is js. These two examples are equivalent:
`console.log('Hello, world!')`
js"console.log('Hello, world!')"
raw output
The raw prefix can be added to snippets to output them directly, bypassing any of Grit's built-in attempts at ensuring that the output code is valid. Use raw for snippets that you always want to output as-is, even if that generates invalid code.
For example:
`console.log($message)` => raw`if(' // I like broken code`
console.log('Hello, world!');
if(' // I like broken code
Metavariables
Metavariables are used to create a binding to a specific part of the syntax tree. They are prefixed with a dollar sign, followed by a name made up of letters, digits, and underscores. Names start with a letter or underscore and should use $lowercase_snake_case. All metavariables must conform to the regex $[a-zA-Z_][a-zA-Z0-9_]*, where the leading $ is a literal dollar sign.
Metavariables' default scope is the entire file, but this scope is usually restricted automatically by pattern auto-wrapping.
The bubble clause can be used to limit the scope of a metavariable.
`console.log($foo)`
console.log('Hello, world!')
$match, $filename, $new_files and $program are reserved keywords used internally by Grit and should not be used as metavariable names. Likewise, metavariables starting with $grit_ are reserved for internal use.
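As a rough illustration of the naming rules above, the following Python sketch checks a candidate name against the regex quoted in the text and against the reserved names. The helper is_usable_metavariable is hypothetical, and this is only an approximation of the rules as stated here, not how Grit itself validates names.

```python
import re

# Regex quoted in the text, with the leading "$" escaped so it is a literal.
METAVAR_RE = re.compile(r"\$[a-zA-Z_][a-zA-Z0-9_]*")

# Names the text lists as reserved; the "$grit_" prefix is checked separately.
RESERVED = {"$match", "$filename", "$new_files", "$program"}


def is_usable_metavariable(name: str) -> bool:
    """Hypothetical helper: True if `name` is well-formed and not reserved."""
    if METAVAR_RE.fullmatch(name) is None:
        return False
    return name not in RESERVED and not name.startswith("$grit_")
```

Under these assumptions, $message and $new_value pass, while $1abc (starts with a digit), $program (reserved) and $grit_tmp (reserved prefix) are rejected.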
Anonymous metavariables
When we do not care about the value of a metavariable, we can use the wildcard metavariable $_. This allows us to match without binding the match to a name.
`console.log($_)`
console.log($_)
Spread metavariables
By default, metavariables match a single node in the syntax tree. However, sometimes it is useful to match a variable number of nodes.
This can be done with the spread metavariable $.... It matches 0 or more nodes, and can be used anywhere a metavariable can be used. Spread metavariables are anonymous, so they cannot be bound to a name.
`console.log($message, $...)` => `// Removed console.log: $message`
console.log('Hello, world!')
console.log('Message 2', 'stuff')
// Removed console.log: 'Hello, world!'
// Removed console.log: 'Message 2'
Explicit assignment
As illustrated in the console.log($message) example, metavariables can be used without being declared, simply by replacing part of a code snippet with a metavariable that stands for that part of the syntax tree.
It is also possible to explicitly declare and assign metavariables:
`const $logger = logger.$action($message)` where { $special_logger = js"$[action]Logger", $logger => $special_logger }
Tip: When using a metavariable in an output snippet, you can wrap its name in square brackets ([]) to distinguish the metavariable name from the snippet literal (e.g. $[name]Class).
Rewrite operator =>
Once you have matched the relevant code, rewrites can be used to transform the matched pattern. They are written using the => operator, with a pattern on both the left-hand and right-hand sides. The whole rewrite itself is a pattern and can be used anywhere a pattern is used.
`println` => `console.log`
println('Hello, world!');
console.log('Hello, world!');
Rewrites can contain metavariables, so we can restrict the rewrite to match only println calls and no other references to println.
`println($message)` => `console.log($message)`
const foo = println; println('Hello, world!');
const foo = println; console.log('Hello, world!');
Tip: Rewrites will work better if you keep them as specific as possible.
The right-hand side of a rewrite is a code snippet or metavariable bound to a code snippet.
Rewrites are just a special kind of pattern with a side effect. Anywhere that other kinds of patterns can be used, so can a rewrite. This means that you can "match a rewrite" to keep your GritQL concise:
engine marzano(0.1) language js `$test($_)` where { $test <: js"test.only" => js"test" }
test.only(() => { expect(true).toEqual(true) })
test(() => { expect(true).toEqual(true) })
Syntax-tree nodes
Syntax-tree nodes are one of the most important types in Grit. They are the only type whose syntax depends on the language you are targeting. Syntax-tree nodes represent unique instances in the syntax tree of the input code. For example, every distinct appearance of the string "foo" in a single file is represented by an individual string() node. Nodes can represent larger groupings of code as well: for example, every arrow function in JavaScript is represented by an arrow_function() node.
For example, the following matches any augmented_assignment_expression, regardless of its operator (which can be one of +=, -=, ...).
augmented_assignment_expression(operator = $op, left = $x, right = $v)
x = 1; x += 5;
The fields that we do not care about can be omitted, so the following patterns are equivalent:
augmented_assignment_expression(operator = $op)
// same as
augmented_assignment_expression(operator = $op, left = $_, right = $_)
Node structure
Each node has a type, which determines the node's available fields and their own types. Each child field of a node can be another syntax-tree node, a list of syntax-tree nodes, or a primitive value.
For example, "foo" is a string and has the following structure:
string(fragment = "foo")
We distinguish syntax-tree nodes, which represent nodes in the input code, from patterns, which are Grit constructs used to match syntax-tree nodes.
Primitives
Grit has several primitive types which are language-agnostic, unlike syntax-tree nodes.
Strings
Grit allows specifying strings outside of a code snippet in a language-agnostic way. Grit strings are surrounded by double quotes ("). They can contain any character, but a double quote (") or a backslash (\) must be escaped with a preceding backslash (\).
In most cases, code snippets can be used interchangeably with Grit strings. Occasionally, however, Grit strings are useful for matching against exact strings instead of being AST-aware.
In general, any code which would be matched by a Grit string would also be matched by a code snippet, but not vice versa. This means that Grit strings can be used to apply transformations such as formatting changes that would not be possible with code snippets.
"Hello, world!" // string
Numbers
Grit has two number types, int and double. Functionally they can be used interchangeably, and you do not need to specify which type you want. The compiler will infer the type based on the context.
Numbers are useful for transforming code based on arithmetic operations:
js"multiply($x)" where { $y = $x * 2, $x => $y }
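The arithmetic in the pattern above can be pictured with a small text-based stand-in. This is only a sketch: it assumes $x is bound to an integer literal, it uses a regular expression rather than a syntax tree, and the function name double_multiply_args is hypothetical.

```python
import re


def double_multiply_args(source: str) -> str:
    """Hypothetical text-based stand-in for the Grit pattern above."""
    # The captured digits play the role of $x; $y = $x * 2 replaces them.
    return re.sub(
        r"multiply\((\d+)\)",
        lambda m: f"multiply({int(m.group(1)) * 2})",
        source,
    )
```

With this stand-in, the assumed input multiply(21) becomes multiply(42), and code that does not call multiply is left unchanged.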
Lists
Grit has a single list type, analogous to lists or arrays in other languages. Lists are constructed with square brackets ([]).
[`1`, `2`, `3`] // list
Lists have two primary uses:
- Accumulating items, which can be done with the += operator.
- Matching against a specific ordered list of AST nodes.
For example, the following pattern accumulates numbers with the += operator, then uses the built-in join function to format them for rewriting:
engine marzano(0.1)
language js
and {
  $new_numbers = [],
  $new_numbers += 3,
  $new_numbers += 4,
  $new_numbers = join(list = $new_numbers, separator = ", "),
  `const numbers = [$numbers]` => `const numbers = [$new_numbers]`
}
const numbers = [1, 2];
const numbers = [3, 4];
Meanwhile, this pattern looks specifically for a function declaration followed by a return statement (and removes both, for illustrative clarity):
engine marzano(0.1)
language js
statement_block($statements) where {
  $statements <: [function_declaration(), return_statement()],
  $statements => .
}
var sumToValue = function (x, y) { function Value(v) { this.value = v; } return new Value(x + y); }; var times = (x, y) => { return x * y; };
var sumToValue = function (x, y) { }; var times = (x, y) => { return x * y; };
Maps
Grit has an immutable map type. Maps are constructed with curly braces ({}). Keys can contain letters and underscores, and values can be any valid Grit pattern.
Maps can be accessed using $key.value syntax, where $key is a metavariable bound to a map and value is a key in the map. Accessors can be matched against like variables and used within a rewrite.
`const capital = $val` where {
  $capitals = { england: `london`, ours: $val },
  $capitals.ours <: `paris`,
  $val => $capitals.england,
}
const capital = paris;
const capital = london;
Sequential patterns
Warning: sequential is only supported at the top level of a Grit program. You cannot nest it inside other patterns or clauses.
By default, a Grit program consists of a single top-level pattern, which is matched against the input code.
In some cases, it is simpler to have the Grit program consist of multiple patterns, processed sequentially.
To do this, just wrap the individual steps in a top-level sequential clause.
For example:
language js
sequential {
  `console.log($message)` => `console.warn($message)`,
  `console.warn($message)` => `console.info($message)`
}
console.log('Hello, world!'); console.warn('Hello, world!');
console.info('Hello, world!'); console.info('Hello, world!');
sequential is equivalent to writing several separate patterns and running each one in order, waiting for the previous to complete before starting the next.
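The ordering behaviour can be pictured with a plain-Python analogy. This sketch assumes simple text replacement stands in for the two rewrites (Grit itself works on the syntax tree), and run_sequential is a hypothetical name.

```python
# Ordered (before, after) pairs standing in for the two rewrites above.
STEPS = [
    ("console.log(", "console.warn("),
    ("console.warn(", "console.info("),
]


def run_sequential(source: str) -> str:
    """Apply each step to the whole input before starting the next one."""
    for before, after in STEPS:
        source = source.replace(before, after)
    return source
```

Because the second step sees the output of the first, a call that started as console.log also ends up as console.info, which is consistent with the sample output shown above.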
Empty pattern
The dot (.) can be used only on the right-hand side of a rewrite, where it deletes code. Semantically, it represents an empty syntax-tree node, so rewriting to a dot (=> .) will remove the matched code.
Regular expressions
Strings prefixed with the letter r are interpreted as regular expressions. Capture groups can be named by postfixing the regular expression with a list of metavariable names surrounded by parentheses (( and )).
For example, the following pattern will match the string "Hello, world!" and bind the metavariable $name to the string "world".
"Hello, world!" <: r"Hello, (.*)!"($name) // $name is now bound to "world"
The patterns are interpreted using the Rust regular expression syntax.
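A capture group binds exactly the text it matches, so where the group ends matters. The sketch below uses Python's re module as a stand-in, on the assumption that a pattern this simple behaves the same under Rust's regex syntax. Using fullmatch (the regex must cover the whole string) is also an assumption here, not confirmed Grit behaviour.

```python
import re

text = "Hello, world!"

# Without a trailing "!" in the pattern, the greedy group takes it too.
greedy = re.fullmatch(r"Hello, (.*)", text)

# Adding the literal "!" after the group leaves it out of the capture.
exact = re.fullmatch(r"Hello, (.*)!", text)
```

Here greedy.group(1) is "world!" while exact.group(1) is "world", so the punctuation has to appear in the pattern if it should stay out of the bound metavariable.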
file and program
The file and program patterns are syntax-tree nodes which match the entire file or program within a file, respectively. They are useful if your pattern requires context across an entire file.
file allows you to match and rewrite the file name as well as its body:
engine marzano(0.1)
language js
file($name, $body) where {
  $name => `$name.bak`,
  $body => `// This file was renamed with Grit!\n\n$body`
}
program is special in that it is available by default no matter what pattern you are writing. For example, the following pattern rewrites console.log calls to logger.log calls only if the file already contains a usage of logger:
engine marzano(0.1) language js `console.log($log)` => `logger.log($log)` where { $program <: contains `logger` }
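The effect of the file-wide condition can be sketched in plain Python. This is an analogy only: it assumes substring checks and text replacement stand in for Grit's syntax-tree matching, and rewrite_if_logger_present is a hypothetical name.

```python
def rewrite_if_logger_present(source: str) -> str:
    """Hypothetical stand-in: rewrite only when `logger` appears in the file."""
    if "logger" not in source:
        # The condition on the whole program fails, so nothing changes.
        return source
    return source.replace("console.log(", "logger.log(")
```

A file containing only console.log('a'); is returned unchanged, while a file that also uses logger somewhere has its console.log calls rewritten.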