| 1 | +# Instructions |
| 2 | + |
| 3 | +Parsing a Smart Game Format string. |
| 4 | + |
| 5 | +[SGF][sgf] is a standard format for storing board game files, in particular Go.
| 6 | + |
| 7 | +SGF is a fairly simple format. An SGF file usually contains a single
| 8 | +tree of nodes, where each node is a property list. The property list
| 9 | +contains key-value pairs; each key can occur only once but may have
| 10 | +multiple values.
| 11 | + |
| 12 | +The exercise will have you parse an SGF string and return a tree structure of properties. |
| 13 | + |
| 14 | +An SGF file may look like this: |
| 15 | + |
| 16 | +```text |
| 17 | +(;FF[4]C[root]SZ[19];B[aa];W[ab]) |
| 18 | +``` |
| 19 | + |
| 20 | +This is a tree with three nodes: |
| 21 | + |
| 22 | +- The top-level node has three properties: FF\[4\] (key = "FF", value
| 23 | + = "4"), C\[root\] (key = "C", value = "root") and SZ\[19\] (key =
| 24 | + "SZ", value = "19"). (FF indicates the version of SGF, C is a
| 25 | + comment and SZ is the size of the board.)
| 26 | + - The top-level node has a single child which has a single property:
| 27 | + B\[aa\]. (Black plays on the point encoded as "aa", which is the |
| 28 | + 1-1 point). |
| 29 | + - The B\[aa\] node has a single child which has a single property: |
| 30 | + W\[ab\]. |
| 31 | + |
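The three-node tree described above could be written out by hand as follows. This is only a sketch: the `SgfTree` class and its `properties` and `children` fields are hypothetical names chosen for illustration, and your language track may prescribe a different shape.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SgfTree:
    """Hypothetical node type: a property map plus a list of child nodes."""
    properties: Dict[str, List[str]] = field(default_factory=dict)
    children: List["SgfTree"] = field(default_factory=list)


# (;FF[4]C[root]SZ[19];B[aa];W[ab]) built manually, innermost node first.
white = SgfTree({"W": ["ab"]})
black = SgfTree({"B": ["aa"]}, [white])
root = SgfTree({"FF": ["4"], "C": ["root"], "SZ": ["19"]}, [black])
```

Every value is stored in a list, even when a key has only one value, so single-valued and multi-valued properties can be handled the same way.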
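| 32 | +As you can imagine, an SGF file contains a lot of nodes with a single
| 33 | +child, which is why there's a shorthand for it: nodes written one after
| 33 | +another inside the same parentheses form a chain, each being the only
| 33 | +child of the node before it.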
| 34 | + |
| 35 | +SGF can encode variations of play. Go players do a lot of backtracking |
| 36 | +in their reviews (let's try this, doesn't work, let's try that) and SGF |
| 37 | +supports variations of play sequences. For example: |
| 38 | + |
| 39 | +```text |
| 40 | +(;FF[4](;B[aa];W[ab])(;B[dd];W[ee])) |
| 41 | +``` |
| 42 | + |
| 43 | +Here the root node has two variations. The first (which by convention
| 44 | +indicates what's actually played) is where Black plays on 1-1. Black was
| 45 | +sent this file by his teacher, who pointed out a more sensible play in
| 46 | +the second child of the root node: `B[dd]` (4-4 point, a very standard
| 47 | +opening to take the corner).
| 48 | + |
| 49 | +A key can have multiple values associated with it. For example: |
| 50 | + |
| 51 | +```text |
| 52 | +(;FF[4];AB[aa][ab][ba]) |
| 53 | +``` |
| 54 | + |
| 55 | +Here `AB` (add black) is used to add three black stones to the board. |
| 56 | + |
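One possible way to handle node chains, variations, and multi-valued keys is a small recursive-descent parser. The sketch below is illustrative rather than a reference solution: the function names `parse`, `_tree`, and `_props` and the `(properties, children)` tuple shape are assumptions made for this example. Values are returned raw, without any escape or whitespace processing.

```python
from typing import Dict, List, Tuple

# Hypothetical node shape for this sketch: (properties, children).
Node = Tuple[Dict[str, List[str]], list]


def parse(sgf: str) -> Node:
    node, pos = _tree(sgf, 0)
    if pos != len(sgf):
        raise ValueError("unexpected trailing input")
    return node


def _tree(s: str, i: int):
    # A tree is "(" followed by one or more ";" nodes, then zero or more subtrees, then ")".
    if i >= len(s) or s[i] != "(":
        raise ValueError("tree must start with '('")
    i += 1
    if i >= len(s) or s[i] != ";":
        raise ValueError("tree has no nodes")
    root = last = None
    while i < len(s) and s[i] == ";":
        props, i = _props(s, i + 1)
        node = (props, [])
        if root is None:
            root = node
        else:
            last[1].append(node)  # each node in a chain is the child of the previous one
        last = node
    while i < len(s) and s[i] == "(":
        child, i = _tree(s, i)
        last[1].append(child)  # variations hang off the last node of the chain
    if i >= len(s) or s[i] != ")":
        raise ValueError("tree must end with ')'")
    return root, i + 1


def _props(s: str, i: int):
    props: Dict[str, List[str]] = {}
    while i < len(s) and s[i].isupper():
        start = i
        while i < len(s) and s[i].isupper():
            i += 1
        key = s[start:i]
        values = []
        while i < len(s) and s[i] == "[":
            i += 1
            value_start = i
            while i < len(s) and s[i] != "]":
                i += 2 if s[i] == "\\" else 1  # a backslash protects the next character
            if i >= len(s):
                raise ValueError("unterminated value")
            values.append(s[value_start:i])  # raw value, escapes not yet processed
            i += 1
        if not values:
            raise ValueError("property without a value")
        props[key] = values
    return props, i
```

For instance, `parse("(;FF[4];AB[aa][ab][ba])")` yields a root with `{"FF": ["4"]}` and one child whose `AB` key maps to `["aa", "ab", "ba"]`.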
| 57 | +All property values will be of the [SGF Text type][sgf-text].
| 58 | +You don't need to implement any other value type. |
| 59 | +Although you can read the [full documentation of the Text type][sgf-text], a summary of the important points is below: |
| 60 | + |
| 61 | +- Newlines are removed if they come immediately after a `\`, otherwise they remain as newlines. |
| 62 | +- All whitespace characters other than newline are converted to spaces. |
| 63 | +- `\` is the escape character. |
| 64 | + Any non-whitespace character after `\` is inserted as-is. |
| 65 | + Any whitespace character after `\` follows the above rules. |
| 66 | + Note that SGF does **not** have escape sequences for whitespace characters such as `\t` or `\n`. |
| 67 | + |
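The rules summarized above can be sketched as a single pass over a raw property value. The function name `sgf_text` is hypothetical, and the sketch assumes the value has already been cut out from between its `[` and `]` delimiters.

```python
def sgf_text(raw: str) -> str:
    """Apply the summarized SGF Text rules to one raw property value."""
    out = []
    i = 0
    while i < len(raw):
        ch = raw[i]
        if ch == "\\":
            i += 1
            if i == len(raw):
                break  # assumption: a trailing backslash escapes nothing and is dropped
            ch = raw[i]
            if ch == "\n":
                i += 1
                continue  # newline right after a backslash is removed
        if ch != "\n" and ch.isspace():
            ch = " "  # any other whitespace becomes a plain space
        out.append(ch)
        i += 1
    return "".join(out)
```

Note that a backslash followed by the letter `t` simply produces `t`: in this scheme `sgf_text("a\\tb")` gives `"atb"`, not a tab.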
| 68 | +Be careful not to get confused between: |
| 69 | + |
| 70 | +- The string as it is represented in a string literal in the tests |
| 71 | +- The string that is passed to the SGF parser |
| 72 | + |
| 73 | +Escape sequences in the string literals may have already been processed by the programming language's parser before they are passed to the SGF parser. |
| 74 | + |
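As an illustration of that distinction in Python (other languages have their own literal rules), the two literals below differ by one backslash in the source code and hand different characters to the SGF parser. The variable names are made up for this example.

```python
# Python processes \t in the literal, so the SGF parser receives a real tab character.
with_tab = "(;A[x\ty])"

# Python processes \\ into one backslash, so the SGF parser receives a backslash followed by "t".
with_backslash = "(;A[x\\ty])"

# A raw string literal leaves the backslash alone and is equivalent to the second form.
raw_form = r"(;A[x\ty])"

tab_seen_by_parser = "\t" in with_tab
backslash_seen_by_parser = "\\" in with_backslash
```

In the first case the Text rules turn the tab into a space; in the second, the backslash escapes the `t`, which is inserted as-is.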
| 75 | +There are a few more complexities to SGF (and parsing in general), which
| 76 | +you can mostly ignore. You should assume that the input is encoded in
| 77 | +UTF-8; the tests won't contain a charset property, so don't worry about
| 78 | +that. Furthermore, you may assume that all newlines are Unix style (`\n`;
| 79 | +no `\r` or `\r\n` will be in the tests) and that the tests contain no
| 80 | +optional whitespace between properties, nodes, etc.
| 81 | + |
| 82 | +[sgf]: https://en.wikipedia.org/wiki/Smart_Game_Format |
| 83 | +[sgf-text]: https://www.red-bean.com/sgf/sgf4.html#text |