表达式解析器是一个基于递归下降解析技术的 Java 项目,支持标量、向量、矩阵的数学表达式解析与计算。项目还提供了一个交互式 REPL(Read-Eval-Print Loop)终端,方便用户实时输入和计算表达式。
本项目采用经典的编译器前端架构,将表达式解析分为四个阶段:
输入: "3 + 4 * 2"
阶段1: 词法分析 (Lexer) 阶段2: 语法分析 (Parser)
┌───────────────────┐ ┌───────────────────────────────────────┐
│ NUMBER(3) │ │ expr │
│ PLUS(+) │ │ └── addExpr │
│ NUMBER(4) │ ────────────▶ │ ├── term │
│ MULTIPLY(*) │ Token序列 │ │ └── unary │
│ NUMBER(2) │ │ │ └── power │
│ EOF │ │ │ └── implicitMul │
└───────────────────┘ │ │ └── postfix │
│ │ └── factor │
│ │ └── 3 │
│ ├── PLUS │
│ └── term │
│ ├── unary │
│ │ └── power │
│ │ └── implicitMul │
│ │ └── postfix │
│ │ └── factor│
│ │ └── 4 │
│ ├── MULTIPLY │
│ └── unary │
│ └── power │
│ └── implicitMul │
│ └── postfix │
│ └── factor│
│ └── 2 │
└───────────────────────────────────────┘
阶段3: 构建 AST 阶段4: 表达式求值 (Eval)
┌───────────────────┐ ┌─────────────────────────────────────┐
│ BinaryOpNode(+) │ │ BinaryOpNode(+) │
│ / \ │ │ left = 3.eval() = 3 │
│ Number BinaryOp│ ────────────▶ │ right = *.eval() │
│ Node(3) Node(*) │ │ left = 4.eval() = 4 │
│ / \ │ │ right = 2.eval() = 2 │
│ Number Number│ │ return 4 * 2 = 8 │
│ Node(4) Node(2)│ │ return 3 + 8 = 11 │
└───────────────────┘ └─────────────────────────────────────┘
透传说明:在
3 + 4 * 2例子中,power、implicitMul、postfix层级均无对应运算符,直接透传给下一层:
power无^→ 透传给implicitMulimplicitMul无隐式乘法(如2x)→ 透传给postfixpostfix无阶乘!→ 透传给factorfactor识别数字 → 返回NumberNode
词法分析器(Lexer)将输入字符串分割成一个个Token(词法单元):
| Token 类型 | 示例 | 说明 |
|---|---|---|
NUMBER |
3.14, 1e-5 |
数字(支持浮点数、科学计数法) |
IDENTIFIER |
sin, PI, x |
标识符(函数名、常量、变量) |
PLUS/MINUS/MULTIPLY/DIVIDE |
+, -, *, / |
四则运算符 |
POWER |
^ |
幂运算符 |
LPAREN/RPAREN |
(, ) |
括号 |
示例:输入 "3 + 4 * 2" 被分割为:[NUMBER:3] [PLUS] [NUMBER:4] [MULTIPLY] [NUMBER:2] [EOF]
语法分析器(Parser)使用递归下降技术,根据运算符优先级递归解析表达式。
优先级从高到低:
| 优先级 | 语法规则 | 说明 |
|---|---|---|
| 高 | factor |
数字、标识符、括号表达式、数组 |
| ↑ | postfix |
阶乘(如 5!) |
| ↑ | implicitMul |
隐式乘法(如 2x、3(4+5)) |
| ↑ | power |
幂运算(右结合,如 2^3^2 = 2^(3^2)) |
| ↑ | unary |
一元正负号(如 -3^2 = -(3^2)) |
| ↑ | term |
乘、除、取模 |
| ↑ | addExpr |
加、减 |
| 低 | expr |
赋值表达式 |
语法规则(BNF 范式:→ 表示定义为,| 表示或,(...) 表示分组,* 表示零次或多次,? 表示零次或一次):
expr → IDENTIFIER ASSIGN expr | addExpr
addExpr → term ((PLUS | MINUS) term)*
term → unary ((MULTIPLY | DIVIDE | MODULO) unary)*
unary → (PLUS | MINUS) unary | power
power → implicitMul (POWER power)?
implicitMul → postfix (postfix)*
postfix → factor FACTORIAL?
factor → NUMBER | IDENTIFIER | LPAREN expr RPAREN | LBRACKET ... RBRACKET
解析完成后生成抽象语法树(ExprNode),树节点类型包括:
| 节点类型 | 说明 | 示例 |
|---|---|---|
NumberNode |
数字字面量 | 3.14 |
VariableNode |
变量引用 | x |
BinaryOpNode |
二元运算 | a + b, x * y |
UnaryOpNode |
一元运算 | -x, +5 |
FunctionNode |
函数调用 | sin(PI/2) |
AssignNode |
变量赋值 | x = 10 |
ArrayNode |
数组字面量 | [1, 2, 3] |
AST 示例:表达式 3 + 4 * 2 的语法树:
(+)
/ \
(3) (*)
/ \
(4) (2)
每个 AST 节点实现 eval() 方法,通过递归调用子节点的 eval() 完成求值。
以 BinaryOpNode 为例(源码):
@Override
public double eval(Map<String, Double> context) {
// 1. 递归求值左子树
double leftVal = left.eval(context);
// 2. 递归求值右子树
double rightVal = right.eval(context);
// 3. 应用运算符
switch (op.type()) {
case PLUS: return leftVal + rightVal;
case MINUS: return leftVal - rightVal;
case MULTIPLY: return leftVal * rightVal;
case DIVIDE: return leftVal / rightVal;
case POWER: return Math.pow(leftVal, rightVal);
// ...
}
}NumberNode 直接返回存储的数值(源码):
@Override
public double eval(Map<String, Double> context) {
return value;
}求值过程:对于 3 + 4 * 2
BinaryOpNode(+).eval()→ 需要左值和右值- 左值:
NumberNode(3).eval()→3 - 右值:
BinaryOpNode(*).eval()→ 需要左值和右值 - 左值:
NumberNode(4).eval()→4 - 右值:
NumberNode(2).eval()→2 - 结果:
4 * 2 = 8 - 最终结果:
3 + 8 = 11
| 特性 | 说明 |
|---|---|
| 数学运算 | 加减乘除、幂运算、取模、阶乘 |
| 常量 | PI(圆周率)、E(自然常数) |
| 函数 | 三角函数、对数函数、统计函数等 |
| 变量 | 支持变量赋值和持久化 |
| 数组 | 数组字面量和数组操作 |
| 矩阵 | 转置、行列式、矩阵乘法等 |
| REPL | 交互式命令行,支持历史结果引用 |
- 构建: JDK 11 或更高版本(JDK 8 的 javadoc 不支持
--release参数) - 运行: Java 8 或更高版本
- Maven(用于构建项目)
在项目根目录下运行 Maven 构建命令:
mvn clean package构建完成后,会在 target/ 目录下生成可执行的 JAR 文件。
构建成功后,运行以下命令启动交互式 REPL:
java -jar target/Expression-Parser-1.4.0.jar启动后将显示欢迎界面,输入 help 可查看支持的完整功能列表。
除了 REPL 交互模式,你也可以将本库作为依赖集成到自己的 Java 项目中。
Maven 依赖:
<dependency>
<groupId>cn.czyx007</groupId>
<artifactId>Expression-Parser</artifactId>
<version>1.4.0</version>
</dependency>Gradle 依赖:
implementation 'cn.czyx007:Expression-Parser:1.4.0'使用示例:
import cn.czyx007.expression_parser.api.ExpressionEvaluator;
import cn.czyx007.expression_parser.ast.Value;
// 简单计算
Value result = ExpressionEvaluator.eval("1 + 2 * 3");
System.out.println(result); // 输出: 7
// 使用变量上下文(表达式内赋值,变量可复用)
Map<String, Object> context = new HashMap<>();
Value result1 = ExpressionEvaluator.eval("x = 10; y = 20; x + y", context);
System.out.println(result1); // 输出: 30
// 复用 context 中的变量继续计算
Value result2 = ExpressionEvaluator.eval("x * y", context);
System.out.println(result2); // 输出: 200REPL(Read-Eval-Print Loop)支持以下功能,方便用户实时输入和计算表达式:
| 运算符 | 说明 |
|---|---|
+, -, *, / |
加减乘除 |
% |
取模 |
^ |
幂运算(右结合,如 2^3^2 = 512) |
! |
阶乘(如 5! = 120) |
| 常量 | 说明 |
|---|---|
PI |
圆周率 π ≈ 3.14159... |
E |
自然常数 e ≈ 2.71828... |
sin, cos, tan, asin, acos, atan
sinh, cosh, tanh
exp, ln, log, log10
sqrt, cbrt, abs
ceil, floor, round, signum, sign
degrees, radians, atan2(y,x), hypot(x,y), pow(base,exponent)
对于向量 X = [...]; Y = [...];(也可手动按序展开输入):
| 函数 | 说明 | 参数要求 |
|---|---|---|
max(X), min(X) |
极值 | 2+ 个 |
sum(X) |
求和 | 1+ 个 |
count(X) |
参数计数 | - |
avg(X) |
平均值 | 1+ 个 |
median(X) |
中位数 | 1+ 个 |
prod(X), product(X) |
乘积 | 1+ 个 |
| 函数 | 说明 | 参数要求 |
|---|---|---|
var(X), variance(X) |
样本方差 | ≥2 |
std(X), stddev(X) |
样本标准差 | ≥2 |
varp(X), variancep(X) |
总体方差 | ≥1 |
stdp(X), stddevp(X) |
总体标准差 | ≥1 |
| 函数 | 说明 | 参数要求 |
|---|---|---|
gcd(X), lcm(X) |
最大公约数/最小公倍数 | 整数,2+ 个 |
range(X) |
极差 | 1+ 个 |
geomean(X) |
几何平均数 | 1+ 个正数 |
norm1(X), sumabs(X) |
绝对值和 (L1) | - |
norm2(X), rms(X) |
欧几里得范数 (L2)/均方根 | - |
percentile(p,X), pctl(p,X) |
百分位数 | p ∈ [0,100] |
| 函数 | 说明 |
|---|---|
cov(X,Y), covariance(X,Y) |
样本协方差 |
covp(X,Y), covariancep(X,Y) |
总体协方差 |
corr(X,Y), correlation(X,Y) |
相关系数 |
| 函数 | 说明 |
|---|---|
dot(X,Y), dotprod(X,Y) |
向量点积 |
dist(X,Y), distance(X,Y), euclidean(X,Y) |
欧几里得距离 |
manhattan(X,Y), taxicab(X,Y) |
曼哈顿距离 |
| 函数 | 说明 |
|---|---|
transpose(matrix), t(matrix) |
矩阵/向量转置 |
det(matrix), determinant(matrix) |
行列式 |
matmul(matrix_A, matrix_B) |
矩阵乘法 |
trace(matrix) |
矩阵的迹(主对角线之和) |
rank(matrix) |
矩阵的秩 |
mean(matrix, axis) |
矩阵均值(axis=0 列,axis=1 行) |
inv(matrix) |
矩阵求逆 |
solve(A, b) |
解线性方程组 Ax=b |
| 函数 | 说明 |
|---|---|
C(n,k), comb(n,k) |
组合数 |
P(n,k), perm(n,k) |
排列数 |
x = 10- 赋值x = 10; y = 2x- 多语句(分号分隔)ans- 上一次计算结果
[1, 2, 3]- 数组字面量[[1,2], [3,4]]- 二维数组(矩阵)scores = [1, 2, 3]- 数组变量赋值avg(scores)- 数组作为函数参数
2PI- 数字与常量3(4+5)- 数字与括号2sqrt(4)- 数字与函数
| 命令 | 说明 |
|---|---|
help, ? |
显示帮助 |
vars |
显示已定义的变量 |
clear |
清除所有变量 |
exit, quit, q |
退出 |
项目包含全面的单元测试。运行以下命令执行测试:
mvn test- src/main/java:包含主代码,包括词法分析器、语法分析器和 REPL。
- src/test/java:包含各组件的单元测试。
- target/:生成的文件和编译后的类。
此项目基于 MIT 许可证开源。详情请参阅 LICENSE 文件。
Expression Parser is a Java project based on recursive descent parsing technology, supporting mathematical expression parsing and calculation for scalars, vectors, and matrices. The project also provides an interactive REPL (Read-Eval-Print Loop) terminal for users to input and calculate expressions in real-time.
This project adopts a classic compiler frontend architecture, dividing expression parsing into four stages:
Input: "3 + 4 * 2"
Stage 1: Lexical Analysis (Lexer) Stage 2: Syntax Analysis (Parser)
┌───────────────────┐ ┌───────────────────────────────────────┐
│ NUMBER(3) │ │ expr │
│ PLUS(+) │ │ └── addExpr │
│ NUMBER(4) │ ────────────▶ │ ├── term │
│ MULTIPLY(*) │ Token Seq │ │ └── unary │
│ NUMBER(2) │ │ │ └── power │
│ EOF │ │ │ └── implicitMul │
└───────────────────┘ │ │ └── postfix │
│ │ └── factor │
│ │ └── 3 │
│ ├── PLUS │
│ └── term │
│ ├── unary │
│ │ └── power │
│ │ └── implicitMul │
│ │ └── postfix │
│ │ └── factor│
│ │ └── 4 │
│ ├── MULTIPLY │
│ └── unary │
│ └── power │
│ └── implicitMul │
│ └── postfix │
│ └── factor│
│ └── 2 │
└───────────────────────────────────────┘
Stage 3: Build AST Stage 4: Expression Evaluation (Eval)
┌───────────────────┐ ┌─────────────────────────────────────┐
│ BinaryOpNode(+) │ │ BinaryOpNode(+) │
│ / \ │ │ left = 3.eval() = 3 │
│ Number BinaryOp│ ────────────▶ │ right = *.eval() │
│ Node(3) Node(*) │ │ left = 4.eval() = 4 │
│ / \ │ │ right = 2.eval() = 2 │
│ Number Number│ │ return 4 * 2 = 8 │
│ Node(4) Node(2)│ │ return 3 + 8 = 11 │
└───────────────────┘ └─────────────────────────────────────┘
Pass-through Note: In the
3 + 4 * 2example, thepower,implicitMul, andpostfixlevels have no corresponding operators and directly pass through to the next level:
powerhas no^→ passes toimplicitMulimplicitMulhas no implicit multiplication (e.g.,2x) → passes topostfixpostfixhas no factorial!→ passes tofactorfactorrecognizes the number → returnsNumberNode
The Lexer (source) splits the input string into Tokens (lexical units):
| Token Type | Example | Description |
|---|---|---|
NUMBER |
3.14, 1e-5 |
Numbers (supports floating-point and scientific notation) |
IDENTIFIER |
sin, PI, x |
Identifiers (function names, constants, variables) |
PLUS/MINUS/MULTIPLY/DIVIDE |
+, -, *, / |
Arithmetic operators |
POWER |
^ |
Power operator |
LPAREN/RPAREN |
(, ) |
Parentheses |
Example: Input "3 + 4 * 2" is split into: [NUMBER:3] [PLUS] [NUMBER:4] [MULTIPLY] [NUMBER:2] [EOF]
The Parser (source) uses recursive descent technology to recursively parse expressions based on operator precedence.
Precedence from high to low:
| Precedence | Grammar Rule | Description |
|---|---|---|
| High | factor |
Numbers, identifiers, parenthesized expressions, arrays |
| ↑ | postfix |
Factorial (e.g., 5!) |
| ↑ | implicitMul |
Implicit multiplication (e.g., 2x, 3(4+5)) |
| ↑ | power |
Power operation (right-associative, e.g., 2^3^2 = 2^(3^2)) |
| ↑ | unary |
Unary plus/minus signs (e.g., -3^2 = -(3^2)) |
| ↑ | term |
Multiplication, division, modulo |
| ↑ | addExpr |
Addition, subtraction |
| Low | expr |
Assignment expressions |
Grammar Rules (BNF notation: → means defined as, | means or, (...) means grouping, * means zero or more, ? means zero or one):
expr → IDENTIFIER ASSIGN expr | addExpr
addExpr → term ((PLUS | MINUS) term)*
term → unary ((MULTIPLY | DIVIDE | MODULO) unary)*
unary → (PLUS | MINUS) unary | power
power → implicitMul (POWER power)?
implicitMul → postfix (postfix)*
postfix → factor FACTORIAL?
factor → NUMBER | IDENTIFIER | LPAREN expr RPAREN | LBRACKET ... RBRACKET
After parsing, an Abstract Syntax Tree (ExprNode) is generated. Tree node types include:
| Node Type | Description | Example |
|---|---|---|
NumberNode |
Numeric literal | 3.14 |
VariableNode |
Variable reference | x |
BinaryOpNode |
Binary operation | a + b, x * y |
UnaryOpNode |
Unary operation | -x, +5 |
FunctionNode |
Function call | sin(PI/2) |
AssignNode |
Variable assignment | x = 10 |
ArrayNode |
Array literal | [1, 2, 3] |
AST Example: The syntax tree for expression 3 + 4 * 2:
(+)
/ \
(3) (*)
/ \
(4) (2)
Each AST node implements the eval() method to complete evaluation through recursive calls to child nodes' eval().
Taking BinaryOpNode as an example (source):
@Override
public double eval(Map<String, Double> context) {
// 1. Recursively evaluate left subtree
double leftVal = left.eval(context);
// 2. Recursively evaluate right subtree
double rightVal = right.eval(context);
// 3. Apply operator
switch (op.type()) {
case PLUS: return leftVal + rightVal;
case MINUS: return leftVal - rightVal;
case MULTIPLY: return leftVal * rightVal;
case DIVIDE: return leftVal / rightVal;
case POWER: return Math.pow(leftVal, rightVal);
// ...
}
}NumberNode directly returns the stored value (source):
@Override
public double eval(Map<String, Double> context) {
return value;
}Evaluation Process: For 3 + 4 * 2
BinaryOpNode(+).eval()→ needs left and right values- Left value:
NumberNode(3).eval()→3 - Right value:
BinaryOpNode(*).eval()→ needs left and right values - Left value:
NumberNode(4).eval()→4 - Right value:
NumberNode(2).eval()→2 - Result:
4 * 2 = 8 - Final result:
3 + 8 = 11
| Feature | Description |
|---|---|
| Mathematical Operations | Addition, subtraction, multiplication, division, power, modulo, factorial |
| Constants | PI (pi), E (natural constant) |
| Functions | Trigonometric, logarithmic, statistical functions, etc. |
| Variables | Support variable assignment and persistence |
| Arrays | Array literals and array operations |
| Matrices | Transpose, determinant, matrix multiplication, etc. |
| REPL | Interactive command line with history result reference |
- Build: JDK 11 or higher (JDK 8's javadoc doesn't support
--releaseflag) - Runtime: Java 8 or higher
- Maven (for building the project)
Run the Maven build command in the project root directory:
mvn clean packageAfter building, an executable JAR file will be generated in the target/ directory.
After successful build, run the following command to start the interactive REPL:
java -jar target/Expression-Parser-1.4.0.jarAfter startup, a welcome screen will be displayed. Enter help to view the complete list of supported features.
In addition to REPL interactive mode, you can also integrate this library as a dependency into your own Java project.
Maven Dependency:
<dependency>
<groupId>cn.czyx007</groupId>
<artifactId>Expression-Parser</artifactId>
<version>1.4.0</version>
</dependency>Gradle Dependency:
implementation 'cn.czyx007:Expression-Parser:1.4.0'Usage Example:
import cn.czyx007.expression_parser.api.ExpressionEvaluator;
import cn.czyx007.expression_parser.ast.Value;
// Simple calculation
Value result = ExpressionEvaluator.eval("1 + 2 * 3");
System.out.println(result); // Output: 7
// With variable context (assign in expression, variables reusable)
Map<String, Object> context = new HashMap<>();
Value result1 = ExpressionEvaluator.eval("x = 10; y = 20; x + y", context);
System.out.println(result1); // Output: 30
// Reuse variables from context
Value result2 = ExpressionEvaluator.eval("x * y", context);
System.out.println(result2); // Output: 200REPL (Read-Eval-Print Loop) supports the following features for users to input and calculate expressions in real-time:
| Operator | Description |
|---|---|
+, -, *, / |
Addition, subtraction, multiplication, division |
% |
Modulo |
^ |
Power operation (right-associative, e.g., 2^3^2 = 512) |
! |
Factorial (e.g., 5! = 120) |
| Constant | Description |
|---|---|
PI |
Pi π ≈ 3.14159... |
E |
Natural constant e ≈ 2.71828... |
sin, cos, tan, asin, acos, atan
sinh, cosh, tanh
exp, ln, log, log10
sqrt, cbrt, abs
ceil, floor, round, signum, sign
degrees, radians, atan2(y,x), hypot(x,y), pow(base,exponent)
For vectors X = [...]; Y = [...]; (can also be manually expanded in order):
| Function | Description | Parameter Requirements |
|---|---|---|
max(X), min(X) |
Extremes | 2+ items |
sum(X) |
Sum | 1+ items |
count(X) |
Parameter count | - |
avg(X) |
Average | 1+ items |
median(X) |
Median | 1+ items |
prod(X), product(X) |
Product | 1+ items |
| Function | Description | Parameter Requirements |
|---|---|---|
var(X), variance(X) |
Sample variance | ≥2 |
std(X), stddev(X) |
Sample standard deviation | ≥2 |
varp(X), variancep(X) |
Population variance | ≥1 |
stdp(X), stddevp(X) |
Population standard deviation | ≥1 |
| Function | Description | Parameter Requirements |
|---|---|---|
gcd(X), lcm(X) |
GCD/LCM | Integers, 2+ items |
range(X) |
Range | 1+ items |
geomean(X) |
Geometric mean | 1+ positive numbers |
norm1(X), sumabs(X) |
Sum of absolute values (L1) | - |
norm2(X), rms(X) |
Euclidean norm (L2)/RMS | - |
percentile(p,X), pctl(p,X) |
Percentile | p ∈ [0,100] |
| Function | Description |
|---|---|
cov(X,Y), covariance(X,Y) |
Sample covariance |
covp(X,Y), covariancep(X,Y) |
Population covariance |
corr(X,Y), correlation(X,Y) |
Correlation coefficient |
| Function | Description |
|---|---|
dot(X,Y), dotprod(X,Y) |
Vector dot product |
dist(X,Y), distance(X,Y), euclidean(X,Y) |
Euclidean distance |
manhattan(X,Y), taxicab(X,Y) |
Manhattan distance |
| Function | Description |
|---|---|
transpose(matrix), t(matrix) |
Matrix/vector transpose |
det(matrix), determinant(matrix) |
Determinant |
matmul(matrix_A, matrix_B) |
Matrix multiplication |
trace(matrix) |
Matrix trace (sum of main diagonal) |
rank(matrix) |
Matrix rank |
mean(matrix, axis) |
Matrix mean (axis=0 columns, axis=1 rows) |
inv(matrix) |
Matrix inverse |
solve(A, b) |
Solve linear equations Ax=b |
| Function | Description |
|---|---|
C(n,k), comb(n,k) |
Combinations |
P(n,k), perm(n,k) |
Permutations |
x = 10- Assignmentx = 10; y = 2x- Multiple statements (semicolon-separated)ans- Previous calculation result
[1, 2, 3]- Array literal[[1,2], [3,4]]- 2D array (matrix)scores = [1, 2, 3]- Array variable assignmentavg(scores)- Array as function parameter
2PI- Number and constant3(4+5)- Number and parentheses2sqrt(4)- Number and function
| Command | Description |
|---|---|
help, ? |
Display help |
vars |
Display defined variables |
clear |
Clear all variables |
exit, quit, q |
Exit |
The project includes comprehensive unit tests. Run the following command to execute tests:
mvn test- src/main/java: Contains main code, including lexer, parser, and REPL.
- src/test/java: Contains unit tests for each component.
- target/: Generated files and compiled classes.
This project is open-sourced under the MIT License. See LICENSE file for details.