Build A Dynamic Excel-Like Formula Engine

Alex Johnson
-
Build A Dynamic Excel-Like Formula Engine

Hey there, tech enthusiasts! Let's dive into creating a bidirectional formula engine for Excel-like tables. Ever wished you could change any cell in a formula and have the others magically update? That's precisely what we're aiming for! This article will walk you through the design, implementation, and considerations for building such a dynamic system. Get ready to explore the fascinating world of two-way calculations!

The Core Idea: Two-Way Calculations

Imagine you're working with the classic physics formula: distance = speed * time. In a standard spreadsheet, you'd input values for speed and time, and the distance would calculate automatically. But what if you want to change the distance and have the time recalculate? That's the essence of a bidirectional engine. It allows you to modify any parameter, and the system intelligently adjusts the others based on the formula and a predefined calculation order. This approach enhances user experience and provides more flexible data manipulation. With the power of this bidirectional formula engine, calculations will be very dynamic.

Understanding the Motivation: Why Bidirectional?

The primary motivation is to provide users with a more intuitive and flexible way to interact with formulas, especially in scenarios involving physical or mathematical equations. Standard spreadsheets require the user to understand which cells are inputs and which are outputs. Bidirectional calculations eliminate this need, allowing users to change any variable and let the system handle the rest. This is particularly useful in scientific or engineering applications, where users often need to experiment with different parameters and observe their effects. The dynamic nature of these calculations allows for a more interactive approach.

Illustrative Example of Use

Let's consider a simple example to illustrate the concept. We'll use a configuration written in YAML to define our formulas and the order of calculations.

formulas:
  - name: "Kinematics"
    formula: "final - initial = speed * time"
    recalc_order: [time, speed, final, initial]

Here's how the system would behave:

  • Initial State: initial = 0, speed = 10, time = 5, final = 50
  • User Changes speed to 20:
    • The system recalculates time (the first variable in recalc_order that's not the one that changed)
    • time = (final - initial) / speed = 50 / 20 = 2.5
  • User Changes final to 100:
    • The system recalculates time
    • time = (100 - 0) / 20 = 5
  • User Changes time to 10:
    • The system recalculates speed (the next variable in recalc_order)
    • speed = (final - initial) / time = 100 / 10 = 10

This example shows the power of a bidirectional system. The order of calculation is paramount to the final result.

Technical Requirements: Building Blocks of the Engine

To make this vision a reality, we need to consider some functional and non-functional requirements.

Functional Requirements

  1. Formula Parsing from YAML:

    • The system should read the configuration from a YAML file at startup.
    • It must support the basic arithmetic operators: +, -, *, /.
    • Include the following functions: abs(), sin(), cos(), tan(), sqrt(), pow()
    • Support parentheses for grouping expressions.
  2. Formula Validation During Initialization:

    • Linearity Checks:
      • Each variable can appear only once in the formula.
      • Variables cannot be multiplied (e.g., area = width * height is not allowed).
      • Division by a variable is also not allowed (e.g., ratio = x / y is not allowed).
      • No exponentiation of variables (e.g., square = x * x is not allowed).
    • Function Restrictions:
      • Functions can only operate on constants or constant expressions (e.g., sin(3.14) is okay).
      • Functions cannot take variables as arguments (e.g., y = sin(x) is not allowed).
    • Mandatory Fields:
      • The recalc_order must contain all variables used in the formula.
  3. Solver Generation:

    • For each variable, the system must symbolically solve the equation concerning that variable.
    • For example, c = a + b would generate three functions:
      • a = c - b
      • b = c - a
      • c = a + b
    • These solvers are then compiled into Func<Dictionary<string, double>, double> for runtime calculations.
  4. Runtime API:

    public class BidirectionalFormula
    {
        public BidirectionalFormula(string formula, List<string> recalcOrder);
    
        public void SetValue(string variable, double value);
        public double GetValue(string variable);
    
        public event Action<string, double> OnValueChanged;
    }
    
  5. Recalculation Logic:

    • When SetValue(changedVariable, value) is called:
      1. Save the new value.
      2. Select the recalculation target: the first variable in recalc_order that isn't the changedVariable.
      3. Call the corresponding solver with current values.
      4. Trigger the OnValueChanged event for the updated variable.

Non-Functional Requirements

  • Target Framework: .NET Framework 4.8.
  • Parsing: Occurs only once at application startup.
  • Runtime: Involves only the substitution of values into pre-generated functions.
  • Data Storage: All values are stored using the double data type.
  • Performance: Recalculation should take under 100ms (given an estimated load of ~1 recalculation per second).
  • Variable Limit: A maximum of 10 variables per formula.

Technical Details: Implementation Insights

Let's delve into the technical aspects of building this engine.

Library Choice

  • Main Option: AngouriMath 1.4.2 (NuGet)
    • Symbolic equation solving.
    • Mathematical expression parsing.
    • Formula simplification.
  • Alternative: MathNet.Symbolics (in case of .NET Framework 4.8 compatibility issues).

Formula Examples: Validation and Testing

✅ Valid Formulas:

sum = a + b
distance = speed * time_value  // time_value is a single variable
result = a + b * 2 - c / 3
velocity = (final - initial) / time
area = side * 4  // 4 is a constant
ratio = abs(a - b) / c  // if c is in recalc_order before a,b

❌ Invalid Formulas:

area = width * height          // variable multiplication
ratio = x / y                  // division by a variable
power = x * x                  // variable used twice
result = sin(angle)            // function of a variable
force = mass * acceleration    // variable multiplication
speed = abs(velocity)          // if velocity is in recalc_order

Error Handling: Dealing with Issues

Initialization Errors (FormulaInitializationException):

  • Non-linear formula.
  • Failed symbolic equation solving.
  • recalc_order missing variables.
  • A variable inside abs() is in recalc_order.

Runtime Errors (FormulaComputeException):

  • Division by zero (e.g., speed = distance / time when time = 0).
  • Negative square root (e.g., x = sqrt(y) when y < 0).
  • NaN/Infinity results from computations.

Prioritization and Design Choices: Guiding the Engine

This section outlines our primary decisions for the project.

Approach to Priorities: Recalculation Order

Originally, we considered a few options:

  1. State-Driven: Recalculate the

You may also like