Why I love Rust for tokenising and parsing

I am currently writing an analysis tool for SQL: sqleibniz, specifically for
the SQLite dialect.

The goal is to perform static analysis of SQL input, including syntax checks
and checks that tables, columns and functions exist. Combining this with an
embedded SQLite runtime and the ability to assert conditions in this runtime
creates a really great dev experience for SQL.

Furthermore, I want to be able to show the user high-quality error messages
with context, explanations and the ability to mute certain diagnostics.

This analysis includes the stages of lexical analysis/tokenisation, the
parsing of SQL according to the SQLite documentation [1] and
the analysis of the resulting constructs.

After completing the static analysis part of the project, I plan on writing an
LSP server for SQL, so stay tuned for that.

In the process of the above, I need to write a tokenizer and a parser, both
for SQL. While I am nowhere near completion of sqleibniz, I have already made
some discoveries around Rust and the handy features the language provides for
developing this kind of software.

Macros

Macros work differently in most languages. However, they are mostly used for
the same reasons: code deduplication and less repetition.

Abstract Syntax Tree Nodes

A node for a statement in the sqleibniz implementation is defined as follows:

/// holds all literal types, such as strings, numbers, etc.
#[derive(Debug)]
pub struct Literal {
    pub t: Token,
}

Furthermore, all nodes are required to implement the Node trait. This trait
is returned by all parser functions and is later used to analyse the contents
of a statement:

pub trait Node: std::fmt::Debug {
    fn token(&self) -> &Token;
}

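To make the role of the trait more concrete, here is a minimal sketch of how a parser function might hand back a node and how later analysis could read its token. The `Token` fields, `parse_literal` and `describe` are hypothetical names made up for this illustration, not taken from sqleibniz.

```rust
// Hypothetical stand-in for the real Token type; its fields are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct Token {
    pub line: usize,
    pub text: String,
}

pub trait Node: std::fmt::Debug {
    fn token(&self) -> &Token;
}

/// holds all literal types, such as strings, numbers, etc.
#[derive(Debug)]
pub struct Literal {
    pub t: Token,
}

impl Node for Literal {
    fn token(&self) -> &Token {
        &self.t
    }
}

// Hypothetical parser function: it returns a trait object, so the caller
// does not need to know which concrete node type was produced.
pub fn parse_literal(text: &str, line: usize) -> Box<dyn Node> {
    Box::new(Literal {
        t: Token { line, text: text.to_string() },
    })
}

// Hypothetical analysis helper: it only relies on the Node trait.
pub fn describe(node: &dyn Node) -> String {
    let t = node.token();
    format!("line {}: {}", t.line, t.text)
}

fn main() {
    let node = parse_literal("'hello'", 1);
    println!("{}", describe(node.as_ref()));
}
```

Because `describe` takes `&dyn Node`, the same helper works for every node type that implements the trait.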
Code duplication

Thus every node not only has to be defined, but also needs an implementation
of the Node trait. This requires a lot of duplicated code, and Rust has a
solution for that.

I want a macro that is able to:

- define a structure with a given identifier and a doc comment
- add arbitrary fields to the structure
- satisfy the Node trait by implementing fn token(&self) -> &Token

Let's take a look at the full code I need the macro to produce for the
Literal and the Explain nodes. While the first one has no fields other than
the Token field t, the second node requires a child field with a type.

/// holds all literal types, such as strings, numbers, etc.
#[derive(Debug)]
pub struct Literal {
    /// predefined for all structures defined with the node! macro
    pub t: Token,
}
impl Node for Literal {
    fn token(&self) -> &Token {
        &self.t
    }
}


/// Explain stmt, see: https://www.sqlite.org/lang_explain.html
#[derive(Debug)]
pub struct Explain {
    /// predefined for all structures defined with the node! macro
    pub t: Token,
    pub child: Option Parser Parser
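
One way a macro meeting the three requirements above could look is sketched below with `macro_rules!`. This is not the actual sqleibniz macro: the `Token` type is a hypothetical stand-in, and `Option<Box<dyn Node>>` as the type of `child` is an assumption chosen for illustration.

```rust
// Hypothetical stand-in for the real Token type.
#[derive(Debug, Clone, PartialEq)]
pub struct Token {
    pub text: String,
}

pub trait Node: std::fmt::Debug {
    fn token(&self) -> &Token;
}

// Sketch of a node! macro: takes doc comments (they arrive as #[doc = "..."]
// attributes), a struct name, and any number of `field: Type` pairs.
macro_rules! node {
    ($(#[$meta:meta])* $name:ident $(, $field:ident : $ty:ty)*) => {
        $(#[$meta])*
        #[derive(Debug)]
        pub struct $name {
            /// predefined for all structures defined with the node! macro
            pub t: Token,
            $(pub $field: $ty,)*
        }

        impl Node for $name {
            fn token(&self) -> &Token {
                &self.t
            }
        }
    };
}

node!(
    /// holds all literal types, such as strings, numbers, etc.
    Literal
);

node!(
    /// Explain stmt, see: https://www.sqlite.org/lang_explain.html
    Explain,
    // Assumed field type, picked only for this sketch.
    child: Option<Box<dyn Node>>
);

fn main() {
    let lit = Literal { t: Token { text: "1".to_string() } };
    let explain = Explain {
        t: Token { text: "EXPLAIN".to_string() },
        child: Some(Box::new(lit)),
    };
    println!("{:?}", explain.token());
    if let Some(child) = &explain.child {
        println!("{:?}", child.token());
    }
}
```

Each `node!` invocation expands to a struct with the predefined `t` field plus the listed fields, together with the matching `impl Node`, so the per-node boilerplate is written only once.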


Written by Mr Viral
