Symsim is a test-bed for implementing reinforcement learning algorithms, formalizing their correctness properties, and testing them. It is implemented in Scala 3, in a purely functional style, and uses property-based testing.
There is no installation or release yet. See Adding a new agent below for how to clone, branch, and run the code.
The implementation is quite memory hungry right now, so we recommend the following sbt setup if you run out of memory:
export SBT_OPTS="-Xmx3G -XX:+UseG1GC -Xss2M"
Place this in your .bashrc, or execute it in the current shell just before starting sbt.
So far discrete (exact) Q-Learning and SARSA are implemented, along with a bunch of simple examples.
Adding a new agent
-
Git clone the repo, or git pull if you already have a clone (in that case you can skip step 2), to get the latest version
-
Change directory to the cloned repo:
cd symsim
-
Create a new branch (the repo is configured to disallow pushing to main). Let our example be tic-tac-toe:
git checkout -b tic-tac-toe
-
Create a new package in src/main/scala/symsim/examples/concrete/. The existing one is called braking; let's call the new one tictactoe:
mkdir -pv src/main/scala/symsim/examples/concrete/tictactoe
The package goes under examples and concrete, for "concrete execution RL".
-
Inside the new directory create a file TicTacToe.scala:
cp -iv src/main/scala/symsim/examples/concrete/braking/Car.scala src/main/scala/symsim/examples/concrete/tictactoe/TicTacToe.scala
edit src/main/scala/symsim/examples/concrete/tictactoe/TicTacToe.scala
Adjust the name of the package object from braking to tictactoe. Then change the four types (both names and definitions) to whatever makes sense for TicTacToe. For instance, create
TicState to represent the state of the game,
TicObservableState, which might be just a renaming because the Tic Tac Toe state space is finite, and
TicAction for the possible moves.
-
Implement the TicTacToe agent.
Edit this file from the top, removing the Car example and introducing the TicTacToe example. There are two parts: the class at the top gives all the logic of the agent, and the instances/constraints part at the bottom uses the type system to prove that our types have all the properties the machinery needs to work. It may be useful to consult the interface definition, which is extensively commented: src/main/scala/symsim/Agent.scala.
-
Working with git and PRs.
Throughout the process you can commit as usual. The first time you try to push, git will tell you how to push to the remote branch. Follow that instruction, then read git's message after the successful push to find the link for creating a pull request. Open that link and create a pull request named Adding Tic Tac Toe. If you are not done, you can mark it as work in progress (create a 'draft pull request' instead of a 'pull request'). After this you can keep pushing new commits from your branch as usual, and others in the project will be able to track and discuss your progress easily.
-
Compiling
To compile your code you can open sbt in the root directory (sbt is the only tool you have to install; you do not need to install scala):
sbt
...>compile
-
Running the learning
There is a test tree corresponding to the main source tree. Under examples/concrete/braking/ you will find the file Experiments.scala, which shows how the braking car learning is executed. So far, we disguise it as a test. You can copy this file to the corresponding directory for tictactoe and adjust it to instantiate the tic-tac-toe learning.
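The types suggested in the TicTacToe.scala step above might be sketched as follows. This is a minimal, self-contained Scala 3 illustration that deliberately does not use the symsim Agent interface (consult src/main/scala/symsim/Agent.scala for the actual signatures). Apart from TicState, TicObservableState, and TicAction, every name here (Mark, empty, validActions, move) and the board representation itself are assumptions made for the example.

```scala
// Hypothetical sketch: the board representation and helper names are
// assumptions for illustration, not part of symsim.
enum Mark:
  case X, O

// The board as 9 cells in row-major order; None means the cell is empty.
case class TicState(cells: Vector[Option[Mark]]):
  require(cells.size == 9, "a tic-tac-toe board has 9 cells")

object TicState:
  val empty: TicState = TicState(Vector.fill(9)(None))

// The state space is finite, so the observable state can simply alias the state.
type TicObservableState = TicState

// An action is the index (0 to 8) of the cell to mark.
type TicAction = Int

// All moves available in a state: the indices of the empty cells.
def validActions(s: TicState): Seq[TicAction] =
  s.cells.indices.filter(i => s.cells(i).isEmpty)

// Placing a mark yields a new immutable state (purely functional style).
def move(s: TicState, a: TicAction, m: Mark): TicState =
  require(s.cells(a).isEmpty, s"cell $a is already taken")
  s.copy(cells = s.cells.updated(a, Some(m)))
```

For example, move(TicState.empty, 4, Mark.X) returns a new board with the centre cell taken and leaves TicState.empty unchanged. Keeping the state immutable fits the purely functional style of the project, but how these types plug into the agent class and its instances is determined by Agent.scala, not by this sketch.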
-
Create a new branch (the repo is configured to disallow pushing to main). Let us continue with the tic-tac-toe example:
git checkout -b tic-tac-toe-tests
-
Create a new package in src/test/scala/symsim/examples/concrete/ for the new agent:
mkdir -pv src/test/scala/symsim/examples/concrete/tictactoe
-
Inside the new directory create a file TicTacToeSpec.scala:
cp -iv src/test/scala/symsim/examples/concrete/braking/CarSpec.scala src/test/scala/symsim/examples/concrete/tictactoe/TicTacToeSpec.scala
edit src/test/scala/symsim/examples/concrete/tictactoe/TicTacToeSpec.scala
Adjust the name of the package object from braking to tictactoe, and import the new agent instances:
import TicTacToe.instances
-
Then add your tests: for each test, add the following line and replace the question marks with a boolean property.
property ("TITLE THAT YOU PREFER TO SHOW IN THE TERMINAL") = ???
-
Test
To test your code you can open sbt in the root directory:
sbt
...>testOnly symsim.examples.concrete.tictactoe.TicTacToeSpec
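To illustrate what replacing the question marks with a boolean property can look like, here is a minimal sketch. It assumes the spec is a ScalaCheck Properties object, which is what the property (...) = ... syntax suggests; check CarSpec.scala for the actual setup. The object name TicTacToeSketchSpec, the boards generator, and the property itself are hypothetical, and the board is modelled with plain Option[Char] cells rather than the symsim agent.

```scala
import org.scalacheck.{Gen, Properties}
import org.scalacheck.Prop.forAll

// Hypothetical spec: the object name, generator, and property are
// illustrations only and do not use the symsim agent.
object TicTacToeSketchSpec extends Properties("TicTacToeSketch"):

  // Generates a board of 9 cells; each cell is empty (None) or holds 'X' or 'O'.
  val boards: Gen[Vector[Option[Char]]] =
    Gen.listOfN(9, Gen.oneOf[Option[Char]](None, Some('X'), Some('O')))
      .map(_.toVector)

  // Marking any empty cell reduces the number of empty cells by exactly one.
  property("a move fills exactly one empty cell") = forAll(boards) { board =>
    val emptyCells = board.indices.filter(i => board(i).isEmpty)
    emptyCells.forall { i =>
      val next = board.updated(i, Some('X'))
      next.count(_.isEmpty) == board.count(_.isEmpty) - 1
    }
  }
```

The body passed to forAll returns a Boolean, which ScalaCheck converts to a property and checks against many generated boards. A real test for the agent would presumably generate states and actions through the imported instances instead of a hand-written generator.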
Symsim is developed at the SQUARE group at IT University of Copenhagen, and at the SIRIUS Centre of University of Oslo. The work is financially supported by the Danish DIREC initiative, under a bridge project Verifiable and Safe AI for Autonomous Systems.