Efficient path tracking and pattern matching for XML, JSON, YAML or any other parsers.
path-expression-matcher provides two core classes for tracking and matching paths:
- Expression: Parses and stores pattern expressions (e.g., "root.users.user[id]")
- Matcher: Tracks current path during parsing and matches against expressions
Compatible with fast-xml-parser and similar tools.
npm install path-expression-matcher

import { Expression, Matcher } from 'path-expression-matcher';
// Create expression (parse once, reuse many times)
const expr = new Expression("root.users.user");
// Create matcher (tracks current path)
const matcher = new Matcher();
matcher.push("root");
matcher.push("users");
matcher.push("user", { id: "123" });
// Match current path against expression
if (matcher.matches(expr)) {
  console.log("Match found!");
  console.log("Current path:", matcher.toString()); // "root.users.user"
}

"root.users.user" // Exact path match
"*.users.user" // Wildcard: any parent
"root.*.user" // Wildcard: any middle
"root.users.*" // Wildcard: any child

"..user" // user anywhere in tree
"root..user" // user anywhere under root
"..users..user" // users somewhere, then user below it

"user[id]" // user with "id" attribute
"user[type=admin]" // user with type="admin" (current node only)
"root[lang]..user" // user under root that has "lang" attribute

"user:first" // First user (counter=0)
"user:nth(2)" // Third user (counter=2, zero-based)
"user:odd" // Odd-numbered users (counter=1,3,5...)
"user:even" // Even-numbered users (counter=0,2,4...)
"root.users.user:first" // First user under users

Note: Position selectors use the counter (occurrence count of the tag name), not the position (child index). For example, in <root><a/><b/><a/></root>, the second <a/> has position=2 but counter=1.

"..user[id]:first" // First user with id, anywhere
"root..user[type=admin]" // Admin user under root

new Expression(pattern, options)
Parameters:
- pattern (string): Pattern to parse
- options.separator (string): Path separator (default: '.')
Example:
const expr1 = new Expression("root.users.user");
const expr2 = new Expression("root/users/user", { separator: '/' });

- hasDeepWildcard() → boolean
- hasAttributeCondition() → boolean
- hasPositionSelector() → boolean
- toString() → string

new Matcher(options)
Parameters:
- options.separator (string): Default path separator (default: '.')
Add a tag to the current path. Position and counter are automatically calculated.
Parameters:
- tagName (string): Tag name
- attrValues (object, optional): Attribute key-value pairs (current node only)
Example:
matcher.push("user", { id: "123", type: "admin" });
matcher.push("item"); // No attributes

Position vs Counter:
- Position: The child index in the parent (0, 1, 2, 3...)
- Counter: How many times this tag name appeared at this level (0, 1, 2...)
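The two definitions above can be made concrete with a small standalone function. This is only a sketch of the idea, not the library's code: `annotateSiblings` is a hypothetical helper that takes the tag names of one parent's children in document order and returns both indexes for each child.

```javascript
// Hypothetical helper (not part of path-expression-matcher): label each child
// of a single parent with its position and its per-tag counter.
function annotateSiblings(tagNames) {
  const seen = new Map(); // tag name -> how many times it has appeared so far
  return tagNames.map((tag, position) => {
    const counter = seen.get(tag) ?? 0; // occurrences of this tag before this child
    seen.set(tag, counter + 1);
    return { tag, position, counter };
  });
}

const siblings = annotateSiblings(["a", "b", "a"]);
console.log(siblings[2]); // the second "a": position 2, counter 1
```

Position increases with every child, while the counter only increases when the same tag name repeats at that level.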
Example:
<root>
  <a/> <!-- position=0, counter=0 -->
  <b/> <!-- position=1, counter=0 -->
  <a/> <!-- position=2, counter=1 -->
</root>

Remove the last tag from the path.
matcher.pop();

Update current node's attributes (useful when attributes are parsed after push).
matcher.push("user"); // Don't know values yet
// ... parse attributes ...
matcher.updateCurrent({ id: "123" });

Clear the entire path.
matcher.reset();

Check if current path matches an Expression.
const expr = new Expression("root.users.user");
if (matcher.matches(expr)) {
  // Current path matches
}

Get current tag name.
const tag = matcher.getCurrentTag(); // "user"

Get attribute value of current node.
const id = matcher.getAttrValue("id"); // "123"

Check if current node has an attribute.
if (matcher.hasAttr("id")) {
  // Current node has "id" attribute
}

Get sibling position of current node (child index in parent).
const position = matcher.getPosition(); // 0, 1, 2, ...

Get repeat counter of current node (occurrence count of this tag name).
const counter = matcher.getCounter(); // 0, 1, 2, ...

Alias for getPosition(). Use getPosition() or getCounter() instead for clarity.
const index = matcher.getIndex(); // Same as getPosition()

Get current path depth.
const depth = matcher.getDepth(); // 3 for "root.users.user"

Get path as string.
const path = matcher.toString(); // "root.users.user"
const path2 = matcher.toString('/'); // "root/users/user"

Get path as array.
const arr = matcher.toArray(); // ["root", "users", "user"]

Create a snapshot of current state.
const snapshot = matcher.snapshot();

Restore from a snapshot.
matcher.restore(snapshot);

import { XMLParser } from 'fast-xml-parser';
import { Expression, Matcher } from 'path-expression-matcher';
class MyParser {
  constructor() {
    this.matcher = new Matcher();
    // Pre-compile stop node patterns
    this.stopNodeExpressions = [
      new Expression("html.body.script"),
      new Expression("html.body.style"),
      new Expression("..svg"),
    ];
  }

  parseTag(tagName, attrs) {
    this.matcher.push(tagName, attrs);
    // Check if this is a stop node
    for (const expr of this.stopNodeExpressions) {
      if (this.matcher.matches(expr)) {
        // Don't parse children, read as raw text
        const raw = this.readRawContent();
        this.matcher.pop(); // Keep push/pop balanced on the early return
        return raw;
      }
    }
    // Continue normal parsing
    this.parseChildren();
    this.matcher.pop();
  }
}

const matcher = new Matcher();
const userExpr = new Expression("..user[type=admin]");
const firstItemExpr = new Expression("..item:first");
function processTag(tagName, value, attrs) {
  matcher.push(tagName, attrs);
  if (matcher.matches(userExpr)) {
    value = enhanceAdminUser(value);
  }
  if (matcher.matches(firstItemExpr)) {
    value = markAsFirst(value);
  }
  matcher.pop();
  return value;
}

const patterns = [
  new Expression("data.users.user"),
  new Expression("data.posts.post"),
  new Expression("..comment[approved=true]"),
];

function shouldInclude(matcher) {
  return patterns.some(expr => matcher.matches(expr));
}

const matcher = new Matcher({ separator: '/' });
const expr = new Expression("root/config/database", { separator: '/' });
matcher.push("root");
matcher.push("config");
matcher.push("database");
console.log(matcher.toString()); // "root/config/database"
console.log(matcher.matches(expr)); // true

const matcher = new Matcher();
matcher.push("root");
matcher.push("user", { id: "123", type: "admin", status: "active" });
// Check attribute existence (current node only)
console.log(matcher.hasAttr("id")); // true
console.log(matcher.hasAttr("email")); // false
// Get attribute value (current node only)
console.log(matcher.getAttrValue("type")); // "admin"
// Match by attribute
const expr1 = new Expression("user[id]");
console.log(matcher.matches(expr1)); // true
const expr2 = new Expression("user[type=admin]");
console.log(matcher.matches(expr2)); // true

const matcher = new Matcher();
matcher.push("root");
// Mixed tags at same level
matcher.push("item"); // position=0, counter=0 (first item)
matcher.pop();
matcher.push("div"); // position=1, counter=0 (first div)
matcher.pop();
matcher.push("item"); // position=2, counter=1 (second item)
console.log(matcher.getPosition()); // 2 (third child overall)
console.log(matcher.getCounter()); // 1 (second "item" specifically)
// :first uses counter, not position
const expr = new Expression("root.item:first");
console.log(matcher.matches(expr)); // false (counter=1, not 0)

- Ancestor nodes: Store only tag name, position, and counter (minimal memory)
- Current node: Store tag name, position, counter, and attribute values
This design minimizes memory usage:
- No attribute names stored (derived from values object when needed)
- Attribute values only for current node, not ancestors
- Attribute checking for ancestors is not supported (acceptable trade-off)
- For 1M nodes with 3 attributes each, saves ~50MB vs storing attribute names
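The storage layout described above can be illustrated with a small path stack. This is a sketch under stated assumptions, not the library's source: `PathStack`, `nodes`, and `levels` are hypothetical names, and the real Matcher may organize its state differently.

```javascript
// Illustrative sketch only: keeps attribute values for the current node and
// drops them as soon as that node becomes an ancestor.
class PathStack {
  constructor() {
    this.nodes = []; // open tags, root first
    // levels[d] counts the children seen so far under nodes[d - 1]
    // (levels[0] is the top level)
    this.levels = [{ total: 0, counts: new Map() }];
  }

  push(tag, values) {
    const level = this.levels[this.levels.length - 1];
    const position = level.total++;             // child index in the parent
    const counter = level.counts.get(tag) ?? 0; // earlier siblings with this tag
    level.counts.set(tag, counter + 1);

    // The previous current node is now an ancestor: forget its attributes.
    const parent = this.nodes[this.nodes.length - 1];
    if (parent) delete parent.values;

    this.nodes.push({ tag, position, counter, values });
    this.levels.push({ total: 0, counts: new Map() });
  }

  pop() {
    this.levels.pop();
    return this.nodes.pop();
  }

  get current() {
    return this.nodes[this.nodes.length - 1];
  }
}

const stack = new PathStack();
stack.push("root", { lang: "en" });
stack.push("item");
stack.pop();
stack.push("item", { id: "7" });

console.log(stack.current.position, stack.current.counter); // 1 1
console.log("values" in stack.nodes[0]); // false: root is an ancestor now
```

Both push and pop touch only the top of each array, which is consistent with constant-time path tracking. The cost of this layout is that a node's attributes are gone once a child has been pushed under it.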
Matching is performed bottom-to-top (from current node toward root):
- Start at current node
- Match segments from pattern end to start
- Attribute checking only works for current node (ancestors have no attribute data)
- Position selectors use counter (occurrence count), not position (child index)
- Expression parsing: One-time cost when Expression is created
- Expression analysis: Cached (hasDeepWildcard, hasAttributeCondition, hasPositionSelector)
- Path tracking: O(1) for push/pop operations
- Pattern matching: O(n*m) where n = path depth, m = pattern segments
- Memory per ancestor node: ~40-60 bytes (tag, position, counter only)
- Memory per current node: ~80-120 bytes (adds attribute values)
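The bottom-to-top strategy described above can be sketched in a few functions. This is a simplified illustration, not the library's algorithm: all names here are hypothetical, attribute conditions are left out, the separator is fixed to '.', and it assumes a pattern without `..` has to account for every entry in the path (the library may treat unanchored patterns differently).

```javascript
const DEEP = Symbol("deep"); // stands for the ".." deep wildcard

// Split a pattern such as "root..user:first" into segments (done once).
function parseSegments(pattern) {
  const segments = [];
  for (const part of pattern.split(".")) {
    if (part !== "") {
      const [tag, selector = null] = part.split(":");
      segments.push({ tag, selector });
    } else if (segments[segments.length - 1] !== DEEP) {
      segments.push(DEEP); // a run of empty parts comes from ".."
    }
  }
  return segments;
}

// Position selectors are tested against the counter, not the child index.
function selectorAccepts(selector, counter) {
  if (selector === null) return true;
  if (selector === "first") return counter === 0;
  if (selector === "odd") return counter % 2 === 1;
  if (selector === "even") return counter % 2 === 0;
  const nth = /^nth\((\d+)\)$/.exec(selector);
  return nth !== null && counter === Number(nth[1]);
}

// path: array of { tag, counter } entries, root first.
function matchesPath(segments, path) {
  // s and p start at the last segment and the current node, then move left.
  function walk(s, p) {
    if (s < 0) return p < 0; // pattern used up: the path must be used up too
    const seg = segments[s];
    if (seg === DEEP) {
      // ".." may absorb zero or more path entries
      for (let rest = p; rest >= -1; rest--) {
        if (walk(s - 1, rest)) return true;
      }
      return false;
    }
    if (p < 0) return false;
    const node = path[p];
    if (seg.tag !== "*" && seg.tag !== node.tag) return false;
    if (!selectorAccepts(seg.selector, node.counter)) return false;
    return walk(s - 1, p - 1);
  }
  return walk(segments.length - 1, path.length - 1);
}

const currentPath = [
  { tag: "root", counter: 0 },
  { tag: "item", counter: 1 },
];
console.log(matchesPath(parseSegments("root.item"), currentPath)); // true
console.log(matchesPath(parseSegments("..item"), currentPath)); // true
console.log(matchesPath(parseSegments("root.item:first"), currentPath)); // false
```

Starting at the current node means a mismatch on the last segment, the most common case during parsing, is rejected after a single comparison.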
// GOOD: Parse once, reuse many times
const expr = new Expression("..user[id]");
for (let i = 0; i < 1000; i++) {
  if (matcher.matches(expr)) {
    // ...
  }
}

// BAD: Parse on every iteration
for (let i = 0; i < 1000; i++) {
  if (matcher.matches(new Expression("..user[id]"))) {
    // ...
  }
}

// For multiple patterns, check all at once
const patterns = [
  new Expression("..user"),
  new Expression("..post"),
  new Expression("..comment"),
];

function matchesAny(matcher, patterns) {
  return patterns.some(expr => matcher.matches(expr));
}

Basic integration:
import { XMLParser } from 'fast-xml-parser';
import { Expression, Matcher } from 'path-expression-matcher';
// Parse the pattern once, outside the callback, so it is not rebuilt for every tag
const adminUserExpr = new Expression("..user[type=admin]");

const parser = new XMLParser({
  // Custom options using path-expression-matcher
  stopNodes: ["script", "style"].map(tag => new Expression(`..${tag}`)),
  tagValueProcessor: (tagName, value, jPath, hasAttrs, isLeaf, matcher) => {
    // matcher is available in callbacks
    if (matcher.matches(adminUserExpr)) {
      return enhanceValue(value);
    }
    return value;
  }
});

npm test

All 77 tests covering:
- Pattern parsing (exact, wildcards, attributes, position)
- Path tracking (push, pop, update)
- Pattern matching (all combinations)
- Edge cases and error conditions
MIT
Issues and PRs welcome! This package is designed to be used by XML/JSON parsers like fast-xml-parser.