Skip to content

feat: add standard input ("-") support across CLI commands#661

Open
Vaibhav701161 wants to merge 10 commits intosourcemeta:mainfrom
Vaibhav701161:feat/stdin-support
Open

feat: add standard input ("-") support across CLI commands#661
Vaibhav701161 wants to merge 10 commits intosourcemeta:mainfrom
Vaibhav701161:feat/stdin-support

Conversation

@Vaibhav701161
Copy link
Contributor

@Vaibhav701161 Vaibhav701161 commented Feb 17, 2026

Add support for reading JSON input from standard input using "-"
as an argument across format, lint, validate, and metaschema
commands.

Key changes:

  • Introduce read_from_stdin() for JSON parsing from std::cin
  • Add StdinError for structured stdin-related failures
  • Prevent multiple "-" arguments via centralized duplicate check
  • Support schema-from-stdin in validate
  • Disallow conflicting combinations (e.g. schema and instance both from stdin)
  • Restrict certain options with stdin (e.g. --check, --fix)
  • Ensure consistent JSON error output with --json flag

For stdin inputs, the current working directory is used as the
resolution base for relative references.

Comprehensive tests added for:

  • successful stdin usage
  • duplicate "-" handling
  • schema-from-stdin
  • invalid combinations
  • JSON error output behavior

fixes #537

Copy link

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

8 issues found across 20 files

Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="test/validate/fail_stdin_multiple.sh">

<violation number="1" location="test/validate/fail_stdin_multiple.sh:22">
P3: Failure test only asserts text stderr output; missing required `--json` error assertion for structured output.</violation>
</file>

<file name="test/format/fail_stdin_check.sh">

<violation number="1" location="test/format/fail_stdin_check.sh:11">
P2: Failure test only validates text output; missing JSON error assertion for this failure case.</violation>
</file>

<file name="test/validate/fail_stdin_schema.sh">

<violation number="1" location="test/validate/fail_stdin_schema.sh:21">
P2: Failing test only asserts human-readable stderr; it should also validate the JSON error output ("--json") per team testing requirements.</violation>
</file>

<file name="test/validate/fail_stdin_instance.sh">

<violation number="1" location="test/validate/fail_stdin_instance.sh:14">
P2: Test only asserts a non-zero exit, but the comment and related tests expect exit code 2; this can mask wrong failure modes.</violation>
</file>

<file name="test/lint/fail_stdin_fix.sh">

<violation number="1" location="test/lint/fail_stdin_fix.sh:11">
P2: Failure test only asserts the human-readable error output; per team guidance, add a JSON variant (`--json` or companion *_json test) to validate structured error output for this failure case.</violation>
</file>

<file name="test/CMakeLists.txt">

<violation number="1" location="test/CMakeLists.txt:75">
P2: New stdin failure tests are registered without corresponding `_json` variants, leaving JSON error output untested for these cases despite the project guideline to include JSON variants for fail_* tests.</violation>
</file>

<file name="src/command_lint.cc">

<violation number="1" location="src/command_lint.cc:272">
P2: Stdin compatibility with `--fix` is validated inside the fix loop, so some files may be modified before encountering a stdin entry and throwing `StdinError`. Validate `from_stdin` before starting modifications to avoid partial execution.</violation>
</file>

<file name="src/command_fmt.cc">

<violation number="1" location="src/command_fmt.cc:40">
P2: Standard input formatting bypasses the try/catch wrapper used for file inputs, so schema-related exceptions will propagate without FileError context/normalization.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

@Vaibhav701161 Vaibhav701161 force-pushed the feat/stdin-support branch 3 times, most recently from 7bfe1ba to b624d7a Compare February 20, 2026 08:33
@Vaibhav701161
Copy link
Contributor Author

@jviotti

I’ve pushed the updated version addressing the earlier feedback.

This adds support for - (stdin) across fmt, lint, validate, and metaschema. I introduced a StdinError so stdin-related failures are handled consistently (including with --json), and centralized duplicate - detection inside for_each_json.

In validate, schema-from-stdin is supported, but using stdin for both schema and instance is explicitly rejected. For lint --fix, I now check for stdin before doing any modifications to avoid partial execution. I also wrapped the stdin path in fmt with the same exception normalization as file inputs.

On the testing side, I added coverage for:

Successful stdin usage for each command

Duplicate - handling

Schema-from-stdin and instance-from-stdin cases

Conflicting combinations

Correct exit codes (including exit code 2 for validation failures)

JSON error output variants for all failure cases

All existing tests pass along with the new ones.

Would appreciate your review, especially on whether the stdin semantics feel consistent with the rest of the CLI.

clean() { rm -rf "$TMP"; }
trap clean EXIT

# lint --fix does not support stdin
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not? I think it can still work just fine?

Copy link
Contributor Author

@Vaibhav701161 Vaibhav701161 Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the implementation to support lint --fix with stdin. Instead of rejecting it, stdin is now formatted and written to stdout when --fix is provided. I also replaced the failing tests with pass_stdin_fix to reflect the supported behavior.

clean() { rm -rf "$TMP"; }
trap clean EXIT

# fmt --check does not support stdin (JSON error output)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had now implemented fmt --check support for stdin by locally buffering stdin only for comparison purposes (consistent with how files are handled). It now returns the appropriate failure output and exit code instead of rejecting stdin outright.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is totally different from all the other ones

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. I’ve refactored it to follow the same structure as the rest of the test suite:

Uses set -o errexit / set -o nounset

Captures output into a temporary file

Performs a full diff assertion

It now matches the conventions used elsewhere.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here. Please follow the testing conventions consistently. You did various of them well, but then many are very minimal and weird like this one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated this test to follow the same structure and assertion pattern as other pass tests (temporary directory, captured output, full diff). It’s now consistent with the existing test suite style.

test "$EXIT_CODE" = "2" || exit 1

# Validate that JSON output has the correct structure
grep -q '"valid": false' "$TMP/output.json" || exit 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We never use grep in any other test. They always do a full diff assert. Please copy the existing tests conventions as close as possible

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had replaced the grep checks with a full JSON diff assertion, matching the existing fail tests. The expected JSON output is now explicitly defined and compared using diff.

}
EOF

# Should fail validation via stdin (exit code 2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to explain these things over and over again in every test

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed this too


echo '"foo"' | "$1" validate "$TMP/schema.json" - - 2>"$TMP/stderr.txt" \
&& EXIT_CODE="$?" || EXIT_CODE="$?"
test "$EXIT_CODE" = "1" || exit 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also note that every fail_ test does a --json assertion afterwards. Please follow the conventions

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing that out. I updated the test to follow the same pattern as other fail_ tests by including a --json variant and performing the appropriate diff assertion afterward.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You still didn't hear though. Instead of having fail_stdin_multiple.sh and fail_stdin_multiple_json.sh, you can do both the text and --json variants in this one itself. We do that in a bunch of tests to avoid duplicating them a lot.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(only for the fail ones)

@Vaibhav701161
Copy link
Contributor Author

@jviotti , I had fixed all the reviews , kindly review it again.


cat << 'EOF' > "$TMP/expected.json"
{
"error": "Cannot read both schema and instance from standard input"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we had an error for using - twice. Why not let that throw here?

}
EOF

echo '123' | "$1" validate "$TMP/schema.json" - --json >"$TMP/stdout.json" 2>"$TMP/stderr.txt" \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can just pipe stdout and stderr to the same file. I think that's fine and will simplify this test

test "$EXIT_CODE" = "2" || exit 1

cat << EOF > "$TMP/expected.txt"
fail: $(pwd)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why $(pwd)? That is very confusing. Can you make it <stdin>?

Suggested change
fail: $(pwd)
fail: <stdin>

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name of this test file does not really explain what you are testing here: double hyphens. Maybe do another pass to make sure the names at more meaningful? i.e. fail_stdin_hyphen_schema_instance

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless I'm wrong, I think you never test a failed validation where the schema comes from standard input

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are never testing a lint failure from standard input?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For most of these, can you also exercise --verbose variants as different test files? --verbose tends to print file paths, and we need to be certain we don't print non sense on the standard input cases

cat << 'EOF' > "$TMP/expected.json"
{
"valid": false,
"errors": [ "(stdin)" ]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"errors": [ "(stdin)" ]
"errors": [ "<stdin>" ]

A bit minor, but I think that's a most common convention

test "$EXIT_CODE" = "2" || exit 1

cat << 'EOF' > "$TMP/expected.txt"
fail: (stdin)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
fail: (stdin)
fail: <stdin>

@jviotti
Copy link
Member

jviotti commented Feb 25, 2026

I think there are a few more commands you need to exercise, like bundle, test, canonicalize, inspect, etc. I think the key here is have a ton of testing for every possible scenario we can think of. Once the tests are good, polishing the implementation will be trivial.

Copy link

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 issue found across 35 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="test/validate/fail_stdin_instance_trace.sh">

<violation number="1" location="test/validate/fail_stdin_instance_trace.sh:17">
P2: Failure test lacks a `--json` assertion variant, so structured error output for this stdin+trace failure path is untested.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.


cat << EOF > "$TMP/expected.txt"
error: The schema file you provided does not represent a valid JSON Schema
at file path $CWD
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still disagree with using CWD here. It is very confusing output. If any, we can create a "fake" file path . Or something else like at standard input. Otherwise with the current error, the issue points at the directory, which as nothing to do with standard input?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would have been nice to just set the file path to /dev/stdin on UNIX systems. Though not sure if there is a Windows equivalent...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe set the path to /dev/stdin on macOS and Linux? And on Windows just set the path to <stdin>? That might be enough

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’ve updated the implementation.

&& EXIT_CODE="$?" || EXIT_CODE="$?"
{"b":2,"a":1,"$schema":"https://json-schema.org/draft/2020-12/schema"}
EOF
test "$EXIT_CODE" = "0" || exit 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need this. set -o errexit does exactly this for every command

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had removed those checks.

clean() { rm -rf "$TMP"; }
trap clean EXIT

cat << 'EOF' | "$1" lint - >"$TMP/output.txt" 2>&1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
cat << 'EOF' | "$1" lint - >"$TMP/output.txt" 2>&1
cat << 'EOF' | "$1" lint - > "$TMP/output.txt" 2>&1

test "$EXIT_CODE" = "6" || exit 1

cat << 'EOF' > "$TMP/expected.txt"
error: Failed to parse the JSON document
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other tests seem to point at file path /dev/null, but this one doesn't? Can we make it all consistent?

test "$EXIT_CODE" = "6" || exit 1

cat << 'EOF' > "$TMP/expected.txt"
error: Failed to parse the JSON document
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here. Should mention /dev/stdin like some other errors?

test "$EXIT_CODE" = "6" || exit 1

cat << 'EOF' > "$TMP/expected.txt"
error: Failed to parse the JSON document
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here. /dev/stdin like the rest

test "$EXIT_CODE" = "6" || exit 1

cat << 'EOF' > "$TMP/expected.txt"
error: Failed to parse the JSON document
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here. /dev/stdin like the rest

cat << EOF > "$TMP/expected.txt"
(RESOURCE) URI: file://$CWD
Type : Static
Root : file://$CWD
Copy link
Member

@jviotti jviotti Mar 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CWD makes no sense here. It looks like you are defaulting the id of a stdin schema to it. If any, this should be file:///dev/stdin. Remember that at least on a UNIX system, the logical path of a standard input file is always /dev/stdin. CWD never makes any sense in this context!

test "$EXIT_CODE" = "6" || exit 1

cat << 'EOF' > "$TMP/expected.txt"
error: Failed to parse the JSON document
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here. /dev/stdin like the rest

test "$EXIT_CODE" = "6" || exit 1

cat << 'EOF' > "$TMP/expected.txt"
error: Failed to parse the JSON document
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here. /dev/stdin like the rest

test "$EXIT_CODE" = "6" || exit 1

cat << 'EOF' > "$TMP/expected.txt"
error: Failed to parse the JSON document
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here. /dev/stdin like the rest

test "$EXIT_CODE" = "2" || exit 1

cat << 'EOF' > "$TMP/expected.txt"
fail: <stdin>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The other inconsistency I see here is the use of <stdin> vs /dev/stdin. I think we previously discussed <stdin>, but since we came up with the idea of /dev/stdin, we should use the latter everywhere?

test "$EXIT_CODE" = "6" || exit 1

cat << 'EOF' > "$TMP/expected.txt"
error: Failed to parse the JSON document
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here. /dev/stdin like the rest

test "$EXIT_CODE" = "6" || exit 1

cat << 'EOF' > "$TMP/expected.txt"
error: Failed to parse the JSON document
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here. /dev/stdin like the rest


diff "$TMP/stdout.txt" "$TMP/expected.txt"

# Schema from stdin without $id
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make this a different test file

EOF
test "$EXIT_CODE" = "2" || exit 1

CWD="$(realpath .)"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my other comment on CWD. This is incorrect. I would have expected file:///dev/stdin

test "$EXIT_CODE" = "6" || exit 1

cat << 'EOF' > "$TMP/expected.txt"
error: Failed to parse the JSON document
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here. /dev/stdin like the rest

test "$EXIT_CODE" = "2" || exit 1

cat << 'EOF' > "$TMP/expected.txt"
fail: <stdin>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
fail: <stdin>
fail: /dev/stdin

test "$EXIT_CODE" = "2" || exit 1

cat << EOF > "$TMP/expected.txt"
<stdin>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
<stdin>
/dev/stdin

}
EOF

echo '"foo"' | "$1" validate "$TMP/schema.json" - >"$TMP/output.txt" 2>&1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
echo '"foo"' | "$1" validate "$TMP/schema.json" - >"$TMP/output.txt" 2>&1
echo '"foo"' | "$1" validate "$TMP/schema.json" - > "$TMP/output.txt" 2>&1

Can you always add a space after the > for consistency? Worth checking the other ones

EOF

echo '{ "foo": "bar" }' | "$1" validate "$TMP/schema.json" - --benchmark > "$TMP/output.txt"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unintended blank line?

}
EOF

echo '{ "foo": "bar" }' | "$1" validate "$TMP/schema.json" - --benchmark > "$TMP/output.txt"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also you are not asserting on the output at all?

echo '{ "foo": "bar" }' | "$1" validate "$TMP/schema.json" - --verbose 2>"$TMP/stderr.txt"

cat << EOF > "$TMP/expected.txt"
ok: <stdin>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ok: <stdin>
ok: /dev/stdin

cat << EOF > "$TMP/expected.txt"
ok: $(realpath "$TMP")/instance1.json
matches $(realpath "$TMP")/schema.json
ok: <stdin>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ok: <stdin>
ok: /dev/stdin

&& EXIT_CODE="$?" || EXIT_CODE="$?"
{"$schema":"https://json-schema.org/draft/2020-12/schema","$ref":"https://example.com/defs"}
EOF
test "$EXIT_CODE" = "0" || exit 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
test "$EXIT_CODE" = "0" || exit 1
test "$EXIT_CODE" = "0"

Remember that || exit 1 is always redundant given set -o errexit. Worth checking the other tests too

EOF

# Schema from stdin referencing a resolved external schema
cat << 'EOF' | "$1" validate - "$TMP/instance.json" --resolve "$TMP/defs.json" \
Copy link
Member

@jviotti jviotti Mar 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we run this one with --verbose and assert on the output? i.e. just with verbose. Not twice.

"type": "string"
}
EOF
test "$EXIT_CODE" = "0" || exit 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here too. Remember no || exit 1 anywhere


cat << EOF > "$TMP/expected.txt"
ok: $(realpath "$TMP")/instance.json
matches <stdin>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
matches <stdin>
matches /dev/stdin

echo '"hello"' | "$1" validate "$TMP/schema.json" - --trace --benchmark > "$TMP/output.txt" \
&& EXIT_CODE="$?" || EXIT_CODE="$?"
test "$EXIT_CODE" = "0" || exit 1
test -s "$TMP/output.txt" || exit 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a weird test. You are just checking the file is non empty pretty much. Can you assert on its full contents? You can always re-write the numbers from the benchmark using sed or something to get a stable output to diff against


echo '"hello"' | "$1" validate "$TMP/schema.json" - --trace --benchmark > "$TMP/output.txt" \
&& EXIT_CODE="$?" || EXIT_CODE="$?"
test "$EXIT_CODE" = "0" || exit 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remember the || exit 1 being unnecessary

@jviotti
Copy link
Member

jviotti commented Mar 9, 2026

Sorry for the delay. I left various comments. It's almost there. Good job on the pretty complex PR

} catch (const StdinError &error) {
const auto is_json{options.contains("json")};
print_exception(is_json, error);
return EXIT_FAILURE;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that we have a new constants for exit codes. Can you use the right one here? Probably EXIT_INVALID_CLI_ARGUMENTS


if (options.contains("check")) {
std::ostringstream stdin_buf;
stdin_buf << std::cin.rdbuf();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No read_from_stdin() here?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Allow all commands to take input using standard input

2 participants