


Mathy uses machine learning (ML) to choose which actions to apply to which nodes in an expression tree.

It picks and takes actions in a loop to accomplish a desired task.

Specifically, Mathy uses TensorFlow 2.x with the Keras API to implement a platform for reinforcement learning environments.


Mathy uses machine learning in a few ways, and has the following features:

Text Preprocessing

Mathy processes an input problem in four steps: it parses the problem text into a tree, converts that tree into a sequence of features for each node, combines those features with the current environment state, and embeds the result into a variable-length sequence of fixed-dimension embeddings.

Text to Tree

A problem text is encoded into tokens, then parsed into a tree that preserves the order of operations while removing parentheses and whitespace. Consider the tokens and tree that result from the input: -3 * (4 + 7)


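Tokens, each followed by its integer type id: "-" 16, "3" 2, "*" 32, "(" 512, "4" 2, "+" 8, "7" 2, ")" 1024, and a final token with no text, 16384.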


Tree: * is the root, with -3 as its left child and + as its right child; + has the children 4 and 7.

Observe that the tree representation is more concise than the tokens array because it doesn't have nodes for hierarchical features like parentheses.
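To make the token listing above concrete, here is a minimal standard-library sketch of a tokenizer that reproduces it. It is not Mathy's tokenizer: the `tokenize` function, the `TOKEN_TYPES` names, and the regular expression are hypothetical, and the type ids are simply copied from the listing. It only handles integers, `+`, `-`, `*`, and parentheses.

```python
import re
from typing import List, Tuple

# Type ids copied from the token listing above. The dictionary keys are
# illustrative names, not Mathy's own constants.
TOKEN_TYPES = {
    "constant": 2,
    "+": 8,
    "-": 16,
    "*": 32,
    "(": 512,
    ")": 1024,
    "eof": 16384,
}

# Optional whitespace, then either a run of digits or one operator/paren.
TOKEN_PATTERN = re.compile(r"\s*(\d+|[-+*()])")


def tokenize(text: str) -> List[Tuple[str, int]]:
    """Split text into (token_text, type_id) pairs, skipping whitespace."""
    text = text.rstrip()
    tokens: List[Tuple[str, int]] = []
    position = 0
    while position < len(text):
        match = TOKEN_PATTERN.match(text, position)
        if match is None:
            raise ValueError(f"unexpected input at position {position}")
        value = match.group(1)
        kind = "constant" if value.isdigit() else value
        tokens.append((value, TOKEN_TYPES[kind]))
        position = match.end()
    # A final token with no text marks the end of the input.
    tokens.append(("", TOKEN_TYPES["eof"]))
    return tokens


tokens = tokenize("-3 * (4 + 7)")
for value, type_id in tokens:
    print(repr(value), type_id)
```

Note that the parentheses appear as tokens here, whereas the tree encodes the same grouping through its shape alone.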

Converting text to trees is accomplished with the expression parser:


from typing import List

from mathy_core import ExpressionParser, MathExpression, Token, VariableExpression

problem = "4 + 2x"
parser = ExpressionParser()
tokens: List[Token] = parser.tokenize(problem)
expression: MathExpression = parser.parse(problem)
assert len(expression.find_type(VariableExpression)) == 1

Tree to List

Rather than try to feed expression trees into a machine learning model, we traverse them to produce node lists.

tree list ordering

You might have noticed that the features from the previous tree are not expressed in the natural order in which we would read them. As observed by Lample and Charton, trees must be visited in an order that preserves the order of operations, so that the model can pick up on the hierarchical features of the input.

For this reason, we visit trees in pre-order for serialization.

Converting math expression trees to lists is done with a helper:


from typing import List
from mathy import ExpressionParser, MathExpression

parser = ExpressionParser()
expression: MathExpression = parser.parse("4 + 2x")
nodes: List[MathExpression] = expression.to_list()
# The expression has five nodes: 4, +, 2, *, x
assert len(nodes) == 5
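The pre-order visit described above can be sketched without Mathy at all. The `Node` class and `to_preorder` function below are hypothetical stand-ins, not part of the library; they only illustrate the order in which nodes are emitted for -3 * (4 + 7).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical stand-in for a math expression node."""

    label: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def to_preorder(node: Optional[Node]) -> List[str]:
    """Visit the node itself, then its left subtree, then its right subtree."""
    if node is None:
        return []
    return [node.label] + to_preorder(node.left) + to_preorder(node.right)


# -3 * (4 + 7): the grouping is carried by the tree shape, not by parentheses.
tree = Node("*", Node("-3"), Node("+", Node("4"), Node("7")))
labels = to_preorder(tree)
print(labels)  # ['*', '-3', '+', '4', '7']
```

Because each operator is emitted before its operands, the position of a node in the list reflects where it sits in the hierarchy.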

Lists to Observations

Mathy turns a list of math expression nodes into a feature list that captures characteristics of the input. Specifically, Mathy converts a node list into two lists, one with node types and another with node values:

Tokens:  *     -3     +     4     7
Values:  0.0   -3.0   0.0   4.0   7.0
Types:   5     10     3     10    10

  • The first row is the input token characters stripped of whitespace and parentheses.
  • The second row is the sequence of floating point node values for the tree, with each non-constant node represented by a mask value.
  • The third row is the node type integer representing the class of the node in the tree.
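As a rough illustration of the conversion, the sketch below builds the value and type rows from a list of node labels. It is an assumption-laden stand-in rather than Mathy's implementation: `nodes_to_features`, `NODE_TYPE_IDS`, and `MASK_VALUE` are hypothetical names, and the type ids (3, 5, 10) and the 0.0 mask are read off the example above.

```python
from typing import List, Tuple

# Type ids read off the example above; the keys are illustrative names.
NODE_TYPE_IDS = {"+": 3, "*": 5, "constant": 10}
# Value used for nodes that are not constants (taken from the example).
MASK_VALUE = 0.0


def nodes_to_features(labels: List[str]) -> Tuple[List[float], List[int]]:
    """Return (values, types) for a list of node labels."""
    values: List[float] = []
    types: List[int] = []
    for label in labels:
        try:
            number = float(label)
        except ValueError:
            # Operators carry no numeric value, so they get the mask value.
            values.append(MASK_VALUE)
            types.append(NODE_TYPE_IDS[label])
        else:
            values.append(number)
            types.append(NODE_TYPE_IDS["constant"])
    return values, types


values, types = nodes_to_features(["*", "-3", "+", "4", "7"])
print(values)  # [0.0, -3.0, 0.0, 4.0, 7.0]
print(types)   # [5, 10, 3, 10, 10]
```

Both lists always have the same length as the node list, so position i in each row describes the same node.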

While feature lists may be passed directly to an ML model, they don't include any information about the state of the problem over time. To work with information over time, Mathy agents draw extra information from the environment when building observations. This extra information includes:

  • Environment Problem Type: environments all specify an environment namespace that is converted into a pair of hashed string values using different random seeds.
  • Episode Relative Time: each observation is able to see a 0-1 floating point value that indicates how close the agent is to running out of moves.
  • Valid Action Mask: Mathy gives weighted estimates for each action at every node. If there are 5 possible actions and 10 nodes in the tree, there are up to 50 possible actions to choose from. A mask of the same size (50 in this example) containing 0/1 values is provided so that the model can mask out nodes with no valid actions when returning probability distributions.
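The relative-time value and the action mask from the list above can be sketched in a few lines of plain Python. Everything here is hypothetical: `relative_time` and `masked_softmax` are not Mathy functions, the assumption that the time value grows from 0 to 1 as moves are used up is a guess at the direction, and the flat mask layout is chosen only for illustration.

```python
import math
from typing import List


def relative_time(moves_taken: int, max_moves: int) -> float:
    """Assumed convention: 0.0 at the start, 1.0 when no moves remain."""
    return min(max(moves_taken / max_moves, 0.0), 1.0)


def masked_softmax(scores: List[float], mask: List[int]) -> List[float]:
    """Softmax over entries whose mask is 1; masked entries get probability 0."""
    if len(scores) != len(mask):
        raise ValueError("scores and mask must have the same length")
    valid = [score for score, keep in zip(scores, mask) if keep]
    if not valid:
        raise ValueError("at least one action must be valid")
    peak = max(valid)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) if keep else 0.0 for s, keep in zip(scores, mask)]
    total = sum(exps)
    return [e / total for e in exps]


num_actions, num_nodes = 5, 10
scores = [0.0] * (num_actions * num_nodes)  # placeholder model outputs
mask = [0] * (num_actions * num_nodes)
mask[3] = 1   # pretend only two of the 50 slots are valid
mask[27] = 1

probs = masked_softmax(scores, mask)
print(len(probs), probs[3], probs[27])  # 50 0.5 0.5
print(relative_time(3, 12))  # 0.25
```

Masking before normalization means the model never assigns probability to an action that the environment would reject.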

Mathy has utilities for making the conversion:


from mathy import (
    MathyEnv,
    MathyEnvState,
    MathyObservation,
    envs,
)

env: MathyEnv = envs.PolySimplify()
state: MathyEnvState = env.get_initial_state()[0]
observation: MathyObservation = env.state_to_observation(state)

# As many nodes as values
assert len(observation.nodes) == len(observation.values)
# Mask is number of nodes times number of actions
assert len(observation.mask) == len(observation.nodes) * env.action_size

Last update: July 24, 2020