Migrating from Farkle 6.x to Farkle 7.0
Farkle 7.0 is a complete rewrite of the library, focusing on improved performance and extensibility. While migration to the new version is seamless for most mainline scenarios, there are some breaking changes to be aware of. This guide lists several of these changes, and helps you address them.
The happy path — fixing obsoletion diagnostics
As soon as you update Farkle's NuGet package version to 7.0, you will see some obsoletion warnings or errors in your code. Your first step should be to resolve these, by following the respective obsoletion messages.
Usually, this will be the majority of the work you need to do. After you do that, you should review the changes in the list below, and update your code accordingly.
Summary of changes
The following changes were made in Farkle 7.0 that might affect compatibility with Farkle 6.x.
General changes
- Farkle was rewritten in C#. F# remains a supported language, and the F# API is provided through a source file automatically included by the NuGet package.
- Support for .NET Standard 2.0 was removed, and the minimum target framework was increased to .NET Standard 2.1. This is expected to change when Unity adds support for CoreCLR, after which Farkle will target only .NET according to the .NET support lifecycle.
- Keeping .NET Standard 2.0 support for a little longer is a possibility, depending on ecosystem developments and community feedback. If you need to run Farkle on a .NET Standard 2.0-compatible framework, please open an issue.
- Instead of a discriminated union, parser and builder errors have a type of object, and can be converted to a human-readable message by calling ToString(). For certain error types carrying additional information, you can try casting the error object to one of the types defined in the Farkle.Diagnostics namespace. There are some active patterns to help F# users.
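The object-typed errors described above can be handled with an ordinary type pattern. The following is only a sketch: SampleDiagnostic is a hypothetical stand-in defined here so the example compiles on its own, and it does not reproduce any actual type from the Farkle.Diagnostics namespace.

```csharp
using System;

// Hypothetical stand-in for an error type that carries extra information.
// The real types live in the Farkle.Diagnostics namespace and are not reproduced here.
public sealed record SampleDiagnostic(int Line, int Column, string Message)
{
    public override string ToString() => $"({Line}, {Column}) {Message}";
}

public static class ErrorHandlingSketch
{
    // Errors arrive as plain objects, so try a type pattern first
    // and fall back to ToString() for everything else.
    public static string Describe(object error) => error switch
    {
        SampleDiagnostic d => $"line {d.Line}, column {d.Column}: {d.Message}",
        _ => error.ToString() ?? string.Empty,
    };

    public static void Main()
    {
        Console.WriteLine(Describe(new SampleDiagnostic(3, 14, "Unexpected end of input")));
        Console.WriteLine(Describe("A plain error message"));
    }
}
```

The fallback branch means code that only needs a message keeps working even when it meets an error type it does not recognize.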
Changes to the builder API
- The type DesigntimeFarkle<T> was split into two types; IGrammarBuilder<T> and IGrammarSymbol<T>, where the latter inherits from the former. The untyped DesigntimeFarkle type got an equivalent treatment. Grammar builders cannot be used as members of a production, while grammar symbols can. This is used to enforce that customization options that apply to the whole grammar must be set at the top-most symbol that gets built.
- The ability to rename symbols was removed. It had been introduced in Farkle 6 because the precompiler required a unique name for each precompiled grammar in an assembly. In Farkle 7.0, the precompiler was redesigned to not have this requirement, leaving no valid use cases for symbol renaming. If you want to set an informative name for the whole grammar, you can use the WithGrammarName extension methods.
- Grammars are case-sensitive by default. You can override this behavior by using the already existing CaseSensitive extension methods.
- In grammars with the NewLine symbol that have not disabled automatic whitespace handling, new lines in unexpected places are ignored by default and do not produce a syntax error. You can customize this behavior by using the newly introduced NewLineIsNoisy extension methods.
- Each grammar can have up to one OperatorScope, which must be set at the top-most symbol that gets built.
- In the F# API, fusers can have up to 6 parameters, instead of 16. If you need more parameters, you have to either refactor your grammar, or write the production with the C# API.
- In the F# API's Farkle.Builder.Terminals module, the genericSigned, genericUnsigned, and genericReal functions are available only when targeting a supported version of .NET. Using them when targeting earlier frameworks will result in a compile error.
- In the F# API's Terminals.stringEx function, the allowEscapeUnicode parameter was removed. If you want to support escaping Unicode characters, include the u character in the escapeChars parameter.
- Regex.Any was updated to match exactly any character, and no longer has lower priority than other character classes.
- The Farkle.Builder.PredefinedSets type was removed. If you were using it, you have to specify the character set yourself.
- In transformers, the first parameter has a type of ParserState, and is passed by reference. This is not expected to cause any problems for F# due to its type inference, but in C#, the parameter declaration in the transformer lambda will be more verbose. If your transformer is (_, data) => …:
  - Starting with C# 14, you can write it as (ref _, data) => ….
  - However, in earlier versions of C#, you have to write the types of both parameters, as (ref ParserState _, ReadOnlySpan<char> data) => ….
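To make the transformer change concrete, here is a minimal sketch of the fully typed lambda form. The ParserState struct and the Transformer<T> delegate below are stand-ins declared only so the example compiles without Farkle; the library's real declarations may differ in detail.

```csharp
using System;

// Stand-ins so the sketch compiles without Farkle; the real ParserState and
// transformer delegate come from the library and may differ in detail.
public struct ParserState { }
public delegate T Transformer<T>(ref ParserState state, ReadOnlySpan<char> data);

public static class TransformerSketch
{
    // Form for C# versions before 14: both parameter types are spelled out.
    public static readonly Transformer<int> ParseInt =
        (ref ParserState _, ReadOnlySpan<char> data) => int.Parse(data);

    // With C# 14 or later, the same lambda could be shortened to:
    //     (ref _, data) => int.Parse(data)

    public static void Main()
    {
        var state = new ParserState();
        Console.WriteLine(ParseInt(ref state, "42")); // prints 42
    }
}
```

Because the state is passed by reference, callers also need the ref keyword at the call site, as Main shows.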
Changes to the precompiler
Note
The precompiler is not available in early preview versions of Farkle 7.0. This section will be updated once the precompiler gets written.
Changes to string regexes
The language of string regexes has been updated to be more in line with standard regex syntax. Specifically:
- Whitespace is no longer ignored. If you were using whitespace to make your regex patterns more readable, you have to use an alternative mechanism for this purpose, such as string concatenation.
- Escaping characters by putting them in single quotes is no longer supported. You have to use a backslash (\) for this purpose instead.
- The behavior of the . pattern was updated according to the new behavior of Regex.Any described above.
- Due to the removal of PredefinedSets, the \p{<set>} syntax is no longer supported and will cause a builder error if used. If you were using this syntax, you have to specify the character set yourself.
- Stacking quantifiers is no longer supported. For example, if you want to match either zero, or between five and ten occurrences of x, you can no longer write x{5,10}?, and have to write (x{5,10})? instead.
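Two of the changes above, significant whitespace and the ban on stacked quantifiers, can be illustrated with plain strings. This sketch builds a pattern from commented fragments by concatenation and spells out the grouped quantifier. It checks the patterns with .NET's own Regex class purely as a stand-in: Farkle's string regex dialect is a separate language, the Number pattern is an invented example, and how a pattern is handed to Farkle is not shown here.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StringRegexSketch
{
    // Whitespace is significant now, so readability comes from splitting the
    // pattern into commented fragments and concatenating them.
    public const string Number =
        "[0-9]+" +          // integer part
        "(\\.[0-9]+)?";     // optional fractional part, dot escaped with a backslash

    // Grouped form of "zero, or five to ten occurrences of x".
    public const string ZeroOrFiveToTen = "(x{5,10})?";

    // Whole-string match using .NET's Regex, which serves only as a stand-in here.
    public static bool Matches(string pattern, string input) =>
        Regex.IsMatch(input, "^(?:" + pattern + ")$");

    public static void Main()
    {
        Console.WriteLine(Matches(ZeroOrFiveToTen, ""));        // True
        Console.WriteLine(Matches(ZeroOrFiveToTen, "xxx"));     // False
        Console.WriteLine(Matches(ZeroOrFiveToTen, "xxxxxxx")); // True
        Console.WriteLine(Matches(Number, "3.14"));             // True
    }
}
```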
Changes to the parser API
- The RuntimeFarkle<T> type was renamed to CharParser<T>.
- The PostProcessor<T> type was renamed to ISemanticProvider<TChar, T> and can support other character types, for future extensibility.
- Instead of an F# Result, the parser functions return values of type ParserResult<T>. In the F# API, you can use the |ParserSuccess|ParserError| active pattern to match a parser result, or the ParserResult.toResult function to convert it to an F# Result.
- The low-level tokenizer API has substantially changed, to the extent that migrating a tokenizer is tantamount to rewriting it. Due to a lack of known third-party uses, no specific guidance will be given. If you have a grammar using a custom tokenizer, you can take a look at the tokenizer design document, or a sample indent-based grammar, to get an idea on how to write a custom tokenizer in Farkle 7.0.
- The Farkle.AST type was removed.
- The Farkle.Position type was renamed to TextPosition.
Changes to the grammars API
- The types in the grammars API were moved from the Farkle.Grammar namespace, to the Farkle.Grammars namespace.
- The grammar format was overhauled to support extensibility and reduced cold start times by directly reading it from memory.
- Reading EGTneo grammar files produced by Farkle 6.x is not supported. Reading grammar files produced by GOLD Parser remains supported, by converting them to the new format under the hood.
- The representation of token symbols produced by a DFA has been streamlined. Instead of a strongly-typed choice between different symbol types, all kinds of symbols (terminals, comments, group boundaries, etc.) are now represented by the TokenSymbol type.
- The shape of the grammars API has substantially changed. Due to a lack of third-party use cases, no specific guidance will be given, but migrating to the new API is expected to be pretty straightforward. If you encounter any difficulties with migrating, feel free to open an issue.
Changes to the templating language
- The API shape of the grammar variable was updated to match the changes to the grammars API.
  - The productions_groupped helper variable was removed; getting the productions of a nonterminal is now built-in, and available with the Nonterminal.Productions property.
- The fmt function was removed.