Parser combination #27
-
Hi, I'm not really sure how to combine multiple parsers. Let's say, for example, that I want two parsers. One is a comment parser that can handle multi-line comments:

let comment = StartsWith("/*".utf8).take(PrefixUpTo("*/".utf8)).skip(Newline())

and the other is a translation parser:

let translation = StartsWith("\"".utf8).take(PrefixUpTo("\"".utf8)) // 1 - key
    .skip(PrefixThrough("=".utf8))
    .skip(PrefixThrough("\"".utf8)).take(PrefixUpTo("\";".utf8)) // 2 - value
    .skip(PrefixThrough(";".utf8))

How can I combine the two parsers above? If I run only the translation parser on a file that contains comments, it doesn't quite work either.

let trad = Skip(PrefixThrough("\"".utf8)).take(PrefixUpTo("\"".utf8)) // 1 - key
    .skip(PrefixThrough("=".utf8))
    .skip(PrefixThrough("\"".utf8)).take(PrefixUpTo("\";".utf8)) // 2 - value
    .skip(PrefixThrough(";".utf8))

This version doesn't work well either: although it seems fine and works across multiple lines, it can start taking the key on line 1 and then take the value on line 2, which is bad 😔. I don't understand how to force a parser to be valid on one line only. Is it possible to combine a parser that works on one line with a parser that works on multiple lines?

var inputLong = """
"sign_in_label" = "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur.";
/* %@ will be a price in the local currency. Examples: "$9.99", "9.99€" */
"monthly_price" = "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam. %@";
"format" = "Lorem ipsum";
"Tagline" = "Sed do eiusmod tempor incididunt ut labore";
/* ipsum" refers to "dev profile" / "social media". please keep it to one word when it is possible */
"social" = "Social";
"special" = "Especial";
"""
struct Line {
    let key: String
    let value: String
    let commentaire: String?
}

func parse(_ input: Substring.UTF8View) -> [Line]? {
    let line = Optional.parser(of: StartsWith("/*".utf8).take(PrefixUpTo("*/".utf8))) // 0 - comment
        .skip(PrefixThrough("\"".utf8)).take(PrefixUpTo("\"".utf8)) // 1 - key
        .skip(PrefixThrough("=".utf8))
        .skip(PrefixThrough("\"".utf8)).take(PrefixUpTo("\";".utf8)) // 2 - value
        .skip(PrefixThrough(";".utf8))
    let converted = line.map { (elem: (Optional<Substring.UTF8View>, Substring.UTF8View, Substring.UTF8View)) -> Line in
        let comment = (elem.0 == nil) ? nil : String(elem.0!)
        return Line(
            key: String(elem.1)!,
            value: String(elem.2)!,
            commentaire: comment?.trimmingCharacters(in: .whitespaces)
        )
    }
    let file = Many(converted, separator: Newline())
    let parsedContent = file.parse(input)
    return parsedContent.output
}

let content = parse(inputLong[...].utf8)
for line in content ?? [] {
    print("key: \(line.key)")
    print("value: \(line.value)")
    // print("comment: \(line.commentaire ?? "")")
    print("---\n")
}

Output: the second and the fifth entries are bad.

Conclusion: in the example above I don't see how to separate the […] I'm all ears if you have any ideas.

Thanks,
Replies: 2 comments 3 replies
-
The only way I can see to combine them is to write a single parser:

let line = Optional.parser(of: StartsWith("/*".utf8).take(PrefixUpTo("*/".utf8))) // 0 - comment
    .skip(PrefixThrough("\"".utf8)).take(PrefixUpTo("\"".utf8)) // 1 - key
    .skip(PrefixThrough("=".utf8))
    .skip(PrefixThrough("\"".utf8)).take(PrefixUpTo("\";".utf8)) // 2 - value
    .skip(PrefixThrough(";".utf8))
-
@mackoj You're close! I think the problem is that your input has multiple new lines before the comments and your separator is Newline(), which will only attempt to parse a single one. Changing the separator to Many(Newline()) might fix things, or even just Whitespace().

One other suggestion would be to avoid PrefixThrough and PrefixUpTo when using skip with parsers, since in this case it may have made things more difficult to debug. Instead you could be very explicit in what you want to match. This is just a sketch so not sure if it's exactly what you want, but something like this:

let comment = StartsWith("/*".utf8).take(PrefixUpTo("*/".utf8)).skip(StartsWith("*/".utf8))
let literal = StartsWith("\"".utf8).take(PrefixUpTo("\"".utf8)).skip(StartsWith("\"".utf8))
let translation = Skip(Whitespace())
    .take(Optional.parser(of: comment))
    .skip(Whitespace())
    .take(literal)
    .skip(Whitespace())
    .skip(StartsWith("=".utf8))
    .skip(Whitespace())
    .take(literal)
    .skip(Whitespace())
    .skip(StartsWith(";".utf8))
    .skip(Whitespace())
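
For anyone who wants to experiment with the same idea without depending on a particular swift-parsing version, here is a rough, library-free sketch of the strategy described in this reply: skip any amount of whitespace (including several blank lines) between pieces, treat the comment as optional, and match the quotes, the = and the ; explicitly. It uses only the Swift standard library. Every name in it (Entry, skipWhitespace, consume, takeUntil, trimmed, parseEntry, parseEntries) is a hypothetical helper made up for illustration and is not part of swift-parsing, and like the literal parser above it assumes keys and values contain no embedded double quotes.

```swift
// Library-free sketch; all helper names are hypothetical, not swift-parsing API.

struct Entry: Equatable {
    let key: String
    let value: String
    let comment: String?
}

/// Drops any run of leading whitespace, including several newlines in a row.
func skipWhitespace(_ input: inout Substring) {
    input = input.drop(while: { $0.isWhitespace })
}

/// Consumes `prefix` if the input starts with it; otherwise leaves the input untouched.
func consume(_ prefix: String, from input: inout Substring) -> Bool {
    guard input.hasPrefix(prefix) else { return false }
    input.removeFirst(prefix.count)
    return true
}

/// Returns the text before the first `terminator` and consumes the terminator too.
func takeUntil(_ terminator: String, from input: inout Substring) -> Substring? {
    var index = input.startIndex
    while index < input.endIndex {
        if input[index...].hasPrefix(terminator) {
            let taken = input[..<index]
            input = input[index...].dropFirst(terminator.count)
            return taken
        }
        index = input.index(after: index)
    }
    return nil
}

/// Removes leading and trailing whitespace.
func trimmed(_ text: Substring) -> String {
    var result = text.drop(while: { $0.isWhitespace })
    while let last = result.last, last.isWhitespace { result.removeLast() }
    return String(result)
}

/// Parses one entry: an optional /* comment */ followed by "key" = "value";
func parseEntry(_ input: inout Substring) -> Entry? {
    var rest = input  // work on a copy so a failed attempt consumes nothing
    skipWhitespace(&rest)

    // Optional block comment, which may span several lines.
    var comment: String? = nil
    if consume("/*", from: &rest) {
        guard let body = takeUntil("*/", from: &rest) else { return nil }
        comment = trimmed(body)
        skipWhitespace(&rest)
    }

    // Explicitly match: "key" = "value";
    guard consume("\"", from: &rest),
          let key = takeUntil("\"", from: &rest) else { return nil }
    skipWhitespace(&rest)
    guard consume("=", from: &rest) else { return nil }
    skipWhitespace(&rest)
    guard consume("\"", from: &rest),
          let value = takeUntil("\"", from: &rest) else { return nil }
    skipWhitespace(&rest)
    guard consume(";", from: &rest) else { return nil }

    input = rest
    return Entry(key: String(key), value: String(value), comment: comment)
}

/// Parses entries until one fails to match.
func parseEntries(_ text: String) -> [Entry] {
    var input = text[...]
    var entries: [Entry] = []
    while let entry = parseEntry(&input) {
        entries.append(entry)
    }
    return entries
}

// Sample input with several blank lines before the comment.
let sample = """
"format" = "Lorem ipsum";


/* refers to "dev profile" */
"social" = "Social";
"""

let entries = parseEntries(sample)
for entry in entries {
    print(entry.key, "=", entry.value, "| comment:", entry.comment ?? "none")
}
```

Because each entry begins by skipping whitespace, no separate separator parser is needed, so extra blank lines between entries stop mattering. And because each piece is matched explicitly, a malformed line makes parseEntry return nil at that point instead of silently pairing a key from one line with a value from the next.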