Building A File Parser
Last week, after reading the article How to Write a Lexer in Go, I found that designing a configuration file parser is not so difficult if you follow that article's approach. I then tried writing a Fluent-bit configuration parser and ended up with this Fluent-Bit configuration parser for Golang.
In this article, I want to show how to parse a Fluent-bit .conf configuration file, and the thinking behind it.
Fluent-bit configuration format and schema
A classic mode Fluent-bit configuration includes two parts:
- Section
- Key/value pair
First of all, we need to define a struct which represents the Fluent-bit configuration file.
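As a minimal sketch of what such a struct could look like: the type and field names below (FluentBitConf, Section, Field) are hypothetical and chosen for illustration, not taken from the repository.

```go
package main

import "fmt"

// Field is one key/value pair inside a section, e.g. "Name cpu".
// (Hypothetical name, for illustration only.)
type Field struct {
	Key   string
	Value string
}

// Section is one bracketed block such as [INPUT] together with its fields.
type Section struct {
	Name   string
	Fields []Field
}

// FluentBitConf holds every section in the order it appears in the file.
type FluentBitConf struct {
	Sections []Section
}

func main() {
	// The in-memory form of a file with a [SERVICE] and an [INPUT] section.
	conf := FluentBitConf{
		Sections: []Section{
			{Name: "SERVICE", Fields: []Field{{Key: "Flush", Value: "5"}}},
			{Name: "INPUT", Fields: []Field{{Key: "Name", Value: "cpu"}}},
		},
	}
	for _, s := range conf.Sections {
		fmt.Println(s.Name, len(s.Fields))
	}
}
```

Keeping sections in a slice rather than a map preserves file order and allows several sections with the same name, such as multiple [INPUT] blocks.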
Once we have a struct, the next step is to parse tokens from the file and save their values into the Go struct. We can borrow the lexer's logic to build our Fluent-bit parser.
In a lexer, the target characters we want to find are called tokens; a token is also the keyword the parser is searching for. The parser reads the characters in a file one by one. Whenever it finds a token, it saves the value between tokens into the final structure and moves on.
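The read-one-character-at-a-time loop can be sketched as follows. The token set in isToken is an assumption based on the classic format (section brackets, whitespace and newline), and both function names are hypothetical.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// isToken reports whether ch is one of the characters the parser reacts to.
// The set is an assumption: section brackets, the whitespace between a key
// and its value, and the newline that ends an entry.
func isToken(ch rune) bool {
	switch ch {
	case '[', ']', ' ', '\t', '\n':
		return true
	}
	return false
}

// splitOnTokens reads the input one rune at a time and returns the
// non-empty pieces of text found between tokens.
func splitOnTokens(input string) []string {
	r := bufio.NewReader(strings.NewReader(input))
	var (
		values []string
		buf    strings.Builder
	)
	flush := func() {
		if buf.Len() > 0 {
			values = append(values, buf.String())
			buf.Reset()
		}
	}
	for {
		ch, _, err := r.ReadRune()
		if err != nil { // end of input
			flush()
			return values
		}
		if isToken(ch) {
			flush() // a token closes the value collected so far
			continue
		}
		buf.WriteRune(ch)
	}
}

func main() {
	fmt.Printf("%q\n", splitOnTokens("[INPUT]\n    Name cpu\n"))
	// prints ["INPUT" "Name" "cpu"]
}
```

This sketch only separates values; it does not yet know whether a value is a section name, a key or a value. That is what the parser state is for.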
Parse a single token
If we want to parse a Section, the parser has to read characters one by one and stop at the [ character, which marks the beginning of a Section. The parser must save the current state as t_section and keep reading until the ] character; the word between [ and ] is the Section value we need to persist into the Go struct.
In parser.parseString(), we have to read until the end of a value (for a section, that is the ] character), then return the value.
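A minimal sketch of this step, assuming a parser type that wraps a bufio.Reader. Only t_section and parseString come from the text; the parser struct, t_none and parseSections are hypothetical, and the real code in the repository may differ.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

type state int

const (
	t_none    state = iota // not inside any value (hypothetical name)
	t_section              // between [ and ]
)

type parser struct {
	r     *bufio.Reader
	state state
}

// parseString reads until the character that ends the current value and
// returns the text before it. For a section that character is ']'.
func (p *parser) parseString() (string, error) {
	end := '\n'
	if p.state == t_section {
		end = ']'
	}
	var sb strings.Builder
	for {
		ch, _, err := p.r.ReadRune()
		if err != nil {
			return "", err // io.EOF here means the value was never closed
		}
		if ch == end {
			return sb.String(), nil
		}
		sb.WriteRune(ch)
	}
}

// parseSections returns the name of every [SECTION] found in the input
// and ignores everything else.
func parseSections(input string) ([]string, error) {
	p := &parser{r: bufio.NewReader(strings.NewReader(input))}
	var names []string
	for {
		ch, _, err := p.r.ReadRune()
		if err == io.EOF {
			return names, nil
		}
		if err != nil {
			return nil, err
		}
		if ch == '[' {
			p.state = t_section // remember what the next value belongs to
			name, err := p.parseString()
			if err != nil {
				return nil, fmt.Errorf("unterminated section: %w", err)
			}
			names = append(names, strings.TrimSpace(name))
			p.state = t_none
		}
	}
}

func main() {
	names, err := parseSections("[SERVICE]\n    Flush 5\n[INPUT]\n    Name cpu\n")
	fmt.Println(names, err)
}
```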
That is all the logic for parsing a section. Parsing a key/value pair follows the same process: the parser just needs to know which state it is in and save the values delimited by whitespace or \n. You can see the code in the GitHub repo.
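The same idea applied to key/value pairs might look like the sketch below. It assumes a key ends at the first space or tab and a value ends at the newline; the state names t_key and t_value, the entry type and parseEntries are hypothetical.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

type state int

const (
	t_key   state = iota // reading a key; whitespace ends it
	t_value              // reading a value; '\n' ends it
)

type entry struct{ Key, Value string }

// parseEntries reads the body of one section rune by rune and returns its
// key/value pairs.
func parseEntries(input string) ([]entry, error) {
	r := bufio.NewReader(strings.NewReader(input))
	var (
		entries  []entry
		cur      = t_key
		key, val strings.Builder
	)
	// flush stores the pair collected so far and goes back to reading a key.
	flush := func() {
		if key.Len() > 0 {
			entries = append(entries, entry{key.String(), strings.TrimSpace(val.String())})
		}
		key.Reset()
		val.Reset()
		cur = t_key
	}
	for {
		ch, _, err := r.ReadRune()
		if err == io.EOF {
			flush() // the last line may lack a trailing newline
			return entries, nil
		}
		if err != nil {
			return nil, err
		}
		switch {
		case ch == '\n':
			flush()
		case cur == t_key && (ch == ' ' || ch == '\t'):
			if key.Len() > 0 { // leading indentation is skipped
				cur = t_value
			}
		case cur == t_key:
			key.WriteRune(ch)
		default:
			val.WriteRune(ch) // values may contain spaces
		}
	}
}

func main() {
	entries, err := parseEntries("    Name cpu\n    Tag  cpu.local\n")
	fmt.Println(entries, err)
}
```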
To parse a configuration file, we have to:
- Define the tokens (key characters)
- Read characters and look for a token
- Save the current state to tell the parser which struct the following characters belong to
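Putting the three steps together, a compact end-to-end sketch could look like this. It is not the code from the repository: apart from t_section, every name is hypothetical, and comments, include directives and other features of the real format are not handled.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
	"unicode"
)

type state int

const (
	t_none    state = iota // between values
	t_section              // between [ and ]
	t_key                  // reading a key
	t_value                // reading a value
)

type Field struct{ Key, Value string }

type Section struct {
	Name   string
	Fields []Field
}

// Parse walks the input rune by rune and switches state on each token.
func Parse(input string) ([]Section, error) {
	// A trailing newline guarantees the last key/value pair is closed.
	r := bufio.NewReader(strings.NewReader(input + "\n"))
	var (
		sections []Section
		cur      = t_none
		key, buf strings.Builder
	)
	// addField stores the finished key/value pair in the latest section.
	addField := func() error {
		if len(sections) == 0 {
			return fmt.Errorf("key %q appears before any section", key.String())
		}
		last := &sections[len(sections)-1]
		last.Fields = append(last.Fields, Field{key.String(), strings.TrimSpace(buf.String())})
		key.Reset()
		buf.Reset()
		cur = t_none
		return nil
	}
	for {
		ch, _, err := r.ReadRune()
		if err == io.EOF {
			if cur == t_section {
				return nil, fmt.Errorf("unterminated section %q", buf.String())
			}
			return sections, nil
		}
		if err != nil {
			return nil, err
		}
		switch {
		case cur == t_none && ch == '[':
			cur = t_section
		case cur == t_none && !unicode.IsSpace(ch):
			cur = t_key
			key.WriteRune(ch)
		case cur == t_section && ch == ']':
			sections = append(sections, Section{Name: strings.TrimSpace(buf.String())})
			buf.Reset()
			cur = t_none
		case cur == t_key && (ch == ' ' || ch == '\t'):
			cur = t_value
		case (cur == t_key || cur == t_value) && ch == '\n':
			if err := addField(); err != nil {
				return nil, err
			}
		case cur == t_key:
			key.WriteRune(ch)
		case cur == t_section || cur == t_value:
			buf.WriteRune(ch)
		}
	}
}

func main() {
	conf := "[SERVICE]\n    Flush 5\n\n[INPUT]\n    Name cpu\n    Tag cpu.local\n"
	sections, err := Parse(conf)
	fmt.Println(sections, err)
}
```

The state variable is what lets one loop serve every kind of value: the same character is stored as part of a section name, a key or a value depending on which token was seen last.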