Add storage-cli tooling crate with stress-test upload #220
darwinsubramaniam wants to merge 11 commits into
| pub enum StressTest { | ||
| /// Upload generated data to every bucket the account already has an | ||
| /// agreement with the given provider for. | ||
| Upload(UploadArgs), |
Maybe just Upload -> ProviderUpload?
| Upload(UploadArgs), | |
| ProviderUpload(UploadArgs), |
The name Provider here may conflict with the role of provider-node. Would it be more distinguishable to call it ClientUpload? Client here refers to the account that wants to upload data to the provider.
@bkontur - please review the comment on the naming.
Bash command: storage-cli stress-test

Do you want the CLI subcommand under stress-test to be upload or provider-upload? For now I kept it as upload for the end user, as stated in issue #175. If provider-upload makes sense here, I am OK with changing the override.
@darwinsubramaniam also it would be cool to add this tool and all its commands to the CI.
@bkontur, OK, will add that part as well.
cd5cedd to 4f8a0cd
| // Off-chain HTTP uploads (no chain, no signer). Constant-fill payload,. | ||
| let user = StorageUserClient::new(config).context("failed to construct provider client")?; | ||
| let payload = vec![0x42; args.size]; | ||
| for bucket in &selected_buckets_id { | ||
| let data_root = user | ||
| .upload(*bucket, &payload, ChunkingStrategy::default()) | ||
| .await | ||
| .with_context(|| format!("upload to bucket {bucket} failed"))?; | ||
| println!( | ||
| " bucket {bucket}: uploaded {} bytes, data_root 0x{}", | ||
| payload.len(), | ||
| hex::encode(data_root.as_bytes()), | ||
| ); | ||
| } |
This is the most important part of this stress test. Based on the configuration, we should have these possibilities:
- configure the number of users (1..N, default 1)
- configure the max payload size (default 0.5 MiB); it will generate random payloads
- configure mode(s): ParallelUsers(1..N) + ParallelUploads(1..X)
The idea is to be able to specify different scenarios just by configuration:
- 1 user with 1000 sequential uploads with max size 0.5 MiB
- 1 user with 100 parallel uploads with max size 0.5 MiB
- 10 parallel users with 100 parallel uploads with max size 0.5 MiB
- (just a note: I know we can also run the stress-test binaries in parallel)
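The scenarios listed above could be captured in a single configuration value. The following is only a sketch: the `Scenario` struct, its field names, the default of one upload per user, and the `peak_in_flight` helper are all hypothetical and are not taken from the PR.

```rust
/// Hypothetical scenario description; the field names are illustrative only.
#[derive(Debug, Clone, PartialEq)]
pub struct Scenario {
    /// Number of simulated users (1..N).
    pub users: usize,
    /// Uploads each user performs.
    pub uploads_per_user: usize,
    /// Upper bound on each generated payload, in bytes.
    pub max_payload_size: usize,
    /// How many users run at the same time (1 = sequential).
    pub parallel_users: usize,
    /// How many uploads one user keeps in flight (1 = sequential).
    pub parallel_uploads: usize,
}

impl Default for Scenario {
    fn default() -> Self {
        Self {
            users: 1,
            uploads_per_user: 1, // assumed default, not stated in the review
            max_payload_size: 512 * 1024, // 0.5 MiB
            parallel_users: 1,
            parallel_uploads: 1,
        }
    }
}

impl Scenario {
    /// Total number of uploads the scenario performs.
    pub fn total_uploads(&self) -> usize {
        self.users * self.uploads_per_user
    }

    /// Upper bound on uploads in flight at any moment.
    pub fn peak_in_flight(&self) -> usize {
        let users = self.parallel_users.min(self.users).max(1);
        let uploads = self.parallel_uploads.min(self.uploads_per_user).max(1);
        users * uploads
    }
}

/// The three example scenarios from the review comment.
pub fn example_scenarios() -> [Scenario; 3] {
    [
        // 1 user, 1000 sequential uploads
        Scenario { uploads_per_user: 1000, ..Scenario::default() },
        // 1 user, 100 parallel uploads
        Scenario { uploads_per_user: 100, parallel_uploads: 100, ..Scenario::default() },
        // 10 parallel users, 100 parallel uploads each
        Scenario {
            users: 10,
            uploads_per_user: 100,
            parallel_users: 10,
            parallel_uploads: 100,
            ..Scenario::default()
        },
    ]
}
```

Keeping the two parallelism axes as separate fields means each scenario is just a different value of the same struct, and a global concurrency cap could be compared against `peak_in_flight()` before any upload starts.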
| // Off-chain HTTP uploads (no chain, no signer). Constant-fill payload,. | |
| let user = StorageUserClient::new(config).context("failed to construct provider client")?; | |
| let payload = vec![0x42; args.size]; | |
| for bucket in &selected_buckets_id { | |
| let data_root = user | |
| .upload(*bucket, &payload, ChunkingStrategy::default()) | |
| .await | |
| .with_context(|| format!("upload to bucket {bucket} failed"))?; | |
| println!( | |
| " bucket {bucket}: uploaded {} bytes, data_root 0x{}", | |
| payload.len(), | |
| hex::encode(data_root.as_bytes()), | |
| ); | |
| } | |
| // TODO: list of parallel users | |
| // Off-chain HTTP uploads (no chain, no signer). Constant-fill payload,. | |
| let user = StorageUserClient::new(config).context("failed to construct provider client")?; | |
| // TODO: dynamic random content with max size by configuration | |
| let payload = vec![0x42; args.size]; | |
| // TODO: make this run in parallel in case of configuration | |
| for bucket in &selected_buckets_id { | |
| let data_root = user | |
| .upload(*bucket, &payload, ChunkingStrategy::default()) | |
| .await | |
| .with_context(|| format!("upload to bucket {bucket} failed"))?; | |
| println!( | |
| " bucket {bucket}: uploaded {} bytes, data_root 0x{}", | |
| payload.len(), | |
| hex::encode(data_root.as_bytes()), | |
| ); | |
| } |
For cases involving many users where many extrinsics need to be invoked, you can consider using pallet_utility to submit batched transactions. This improves overall execution efficiency. However, keep in mind the block weight and size limits when constructing large batches.
> For cases involving many users where many extrinsics need to be invoked, you can consider using pallet_utility to submit batched transactions. This improves overall execution efficiency. However, keep in mind the block weight and size limits when constructing large batches.
@danielbui12 what do you mean by using pallet_utility here? This is just stressing the provider upload RPC; there is no on-chain call and no transaction.
| /// This resolves targets from chain (`MemberBuckets[account]` ∩ buckets with a | ||
| /// `StorageAgreements[bucket][provider]` entry) and never creates buckets or | ||
| /// agreements — if nothing matches, it errors out. | ||
| pub async fn upload(global: &GlobalArgs, args: &UploadArgs) -> Result<()> { |
@darwinsubramaniam at the end, it should print at least some metrics (at least timings, size uploaded, ...). On the other hand (outside the scope of this PR), we will collect internal statistics in #214.
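As a rough illustration of the kind of end-of-run metrics requested here, the sketch below aggregates per-upload results into counts, bytes, latency bounds, and throughput. The `Outcome` and `Summary` types and their fields are hypothetical stand-ins, not the types used in the PR.

```rust
use std::time::Duration;

/// Result of one upload attempt (hypothetical shape).
#[derive(Debug, Clone)]
pub struct Outcome {
    pub bytes: u64,
    pub latency: Duration,
    pub ok: bool,
}

/// Aggregate view over a whole run (hypothetical shape).
#[derive(Debug, PartialEq)]
pub struct Summary {
    pub succeeded: usize,
    pub failed: usize,
    pub bytes_uploaded: u64,
    pub min_latency: Option<Duration>,
    pub max_latency: Option<Duration>,
    pub mean_latency: Option<Duration>,
    pub throughput_bytes_per_sec: f64,
}

/// Summarizes outcomes; only successful uploads count toward bytes and latency.
pub fn summarize(outcomes: &[Outcome], wall_clock: Duration) -> Summary {
    let ok: Vec<&Outcome> = outcomes.iter().filter(|o| o.ok).collect();
    let bytes_uploaded: u64 = ok.iter().map(|o| o.bytes).sum();
    let total_latency: Duration = ok.iter().map(|o| o.latency).sum();
    let mean_latency = if ok.is_empty() {
        None
    } else {
        Some(total_latency / ok.len() as u32)
    };
    let secs = wall_clock.as_secs_f64();
    Summary {
        succeeded: ok.len(),
        failed: outcomes.len() - ok.len(),
        bytes_uploaded,
        min_latency: ok.iter().map(|o| o.latency).min(),
        max_latency: ok.iter().map(|o| o.latency).max(),
        mean_latency,
        // Throughput uses wall-clock time so parallel uploads are reflected.
        throughput_bytes_per_sec: if secs > 0.0 { bytes_uploaded as f64 / secs } else { 0.0 },
    }
}
```

Dividing by wall-clock time rather than summed latency is a deliberate choice in this sketch: with parallel uploads the summed latency overstates elapsed time and would understate throughput.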
Add `utils/storage-cli`, a clap-based operator CLI built on the `storage-client` SDK, starting with a `stress-test upload` subcommand.
- New workspace member `utils/storage-cli`; promote `clap` to a workspace dependency (version only, features set per-crate).
- `stress-test upload` resolves target buckets from chain (MemberBuckets[account] intersected with buckets that have a StorageAgreements[bucket][provider] entry) and uploads generated data over the provider HTTP API. It never creates buckets or agreements and errors clearly when no matching buckets exist.
- Reuses AdminClient (chain reads) and StorageUserClient (HTTP upload); no duplicated chain or HTTP logic.
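The target resolution described in this commit message amounts to a set intersection followed by an error when the result is empty. A minimal sketch, assuming both bucket lists have already been read from chain; the `BucketId = u64` alias, the `resolve_targets` name, and the error text are illustrative assumptions, not the PR's code.

```rust
use std::collections::BTreeSet;

/// Illustrative alias; the real `BucketId` type is not shown in this thread.
pub type BucketId = u64;

/// Intersects the account's buckets with the buckets that already have an
/// agreement with the provider. Never creates anything; errors when empty.
pub fn resolve_targets(
    member_buckets: &[BucketId],
    buckets_with_agreement: &[BucketId],
) -> Result<Vec<BucketId>, String> {
    let agreed: BTreeSet<BucketId> = buckets_with_agreement.iter().copied().collect();
    // A BTreeSet keeps the result sorted and free of duplicates.
    let targets: BTreeSet<BucketId> = member_buckets
        .iter()
        .copied()
        .filter(|bucket| agreed.contains(bucket))
        .collect();
    if targets.is_empty() {
        return Err("no bucket has an agreement with this provider".to_string());
    }
    Ok(targets.into_iter().collect())
}
```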
…sion to workspace and added license header, based on code review
1acd933 to c582a40
…mmary
Replace the single constant-fill upload per bucket with load driven entirely by configuration:
- --users / --uploads-per-user / --max-payload-size (random payloads)
- --parallel-users and --parallel-uploads axes, plus --max-concurrency cap
- targets buckets the account already has an agreement with the provider for
Add a reusable metrics module (OpOutcome / OpSummary / summarize) tagged by Operation, so a scenario computes and returns aggregate metrics (counts, bytes, throughput, latency) and main owns viewing them, with --output text|json.
Wire the new flags into the integration-tests workflow.
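One way to produce random payloads bounded by a maximum size is sketched below. It uses a small xorshift generator only so the example needs nothing beyond the standard library; this is a stand-in, not the generator the PR uses, and it is not suitable for anything security-sensitive. Treating the maximum as an upper bound on a random length (rather than a fixed length) is also an assumption.

```rust
/// Tiny xorshift64 generator; a std-only stand-in for a real RNG crate.
pub struct XorShift64(u64);

impl XorShift64 {
    pub fn new(seed: u64) -> Self {
        // The xorshift state must never be zero, so substitute a fixed constant.
        Self(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    pub fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Returns pseudo-random bytes with a length in `1..=max_size`.
/// The modulo introduces a slight length bias, which is fine for load testing.
pub fn random_payload(rng: &mut XorShift64, max_size: usize) -> Vec<u8> {
    assert!(max_size > 0, "max_size must be at least 1 byte");
    let len = 1 + (rng.next_u64() as usize) % max_size;
    let mut payload = Vec::with_capacity(len);
    while payload.len() < len {
        let word = rng.next_u64().to_le_bytes();
        let take = (len - payload.len()).min(word.len());
        payload.extend_from_slice(&word[..take]);
    }
    payload
}
```

Seeding the generator explicitly keeps a run reproducible, which helps when comparing two stress-test runs against the same provider.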
Model the operation kind as a trait whose implementors supply their display labels, replacing a closed enum with three exhaustive match methods and a pre-declared Read/Delete that required #[allow(dead_code)].
OpSummary now stores the resolved OpLabels and summarize() is generic over the operation, so metrics.rs is fully operation-agnostic: a new operation is a self-contained impl with no central list to extend. The Upload marker lives in the stress-test command for now.
Re-architecture: storage-cli-operations
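The trait-based design described in this commit message might look roughly like the following. The names `Operation`, `OpLabels`, `OpSummary`, `summarize`, and `Upload` come from the commit message, but every field, the associated constant, and the `headline` helper are assumptions made for illustration.

```rust
/// Display labels supplied by an operation (fields are hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct OpLabels {
    pub name: &'static str,
    pub past_tense: &'static str,
}

/// Each operation kind is a marker type that provides its own labels,
/// so there is no central enum to extend.
pub trait Operation {
    const LABELS: OpLabels;
}

/// Marker for the upload operation.
pub struct Upload;

impl Operation for Upload {
    const LABELS: OpLabels = OpLabels { name: "upload", past_tense: "uploaded" };
}

/// Summary that stores the resolved labels instead of an enum variant.
#[derive(Debug, PartialEq, Eq)]
pub struct OpSummary {
    pub labels: OpLabels,
    pub succeeded: usize,
    pub failed: usize,
}

impl OpSummary {
    pub fn headline(&self) -> String {
        format!("{}: {} ok, {} failed", self.labels.name, self.succeeded, self.failed)
    }
}

/// Generic over the operation; `results` holds one success flag per attempt.
pub fn summarize<O: Operation>(results: &[bool]) -> OpSummary {
    let succeeded = results.iter().filter(|ok| **ok).count();
    OpSummary { labels: O::LABELS, succeeded, failed: results.len() - succeeded }
}
```

Under this shape, adding a read or delete operation would mean adding one marker type and one `impl Operation`, with no change to `summarize` or `OpSummary`.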
…dule
Split the upload operation out of the stress-test command into a new top-level `actions` module. actions::upload owns the Upload marker and the upload_once(client, bucket, payload) -> OpOutcome primitive; stress_test now only orchestrates (users x uploads, parallelism, concurrency cap) and composes the action. A future read/delete action is a sibling module here, usable by any scenario without touching the metrics layer.
Also moves BucketId into `common`, de-duplicates the user dispatch behind a single run_one closure, and switches a panicked user task from aborting the whole run (bail!) to warn-and-continue so partial metrics still reach `main`.
Re-architecture: storage-cli-operations
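The warn-and-continue behaviour for a panicked user task can be illustrated with plain threads. This sketch uses `std::thread` as a stand-in for whatever async runtime the PR uses, and `run_users` with its `fn(usize) -> u64` signature is hypothetical; the point is only that a failed join is logged and counted instead of aborting the run.

```rust
use std::thread;

/// Runs `run_one` once per user on its own thread and returns the results of
/// the users that finished, plus the number of users whose task panicked.
pub fn run_users(users: usize, run_one: fn(usize) -> u64) -> (Vec<u64>, usize) {
    let handles: Vec<thread::JoinHandle<u64>> = (0..users)
        .map(|user| thread::spawn(move || run_one(user)))
        .collect();

    let mut bytes_per_user = Vec::new();
    let mut panicked = 0;
    for (user, handle) in handles.into_iter().enumerate() {
        match handle.join() {
            Ok(bytes) => bytes_per_user.push(bytes),
            Err(_) => {
                // Warn and continue so partial results still reach the caller.
                eprintln!("warning: task for user {} panicked; continuing", user);
                panicked += 1;
            }
        }
    }
    (bytes_per_user, panicked)
}
```

Returning the panic count alongside the partial results lets the caller report that the metrics are incomplete rather than silently dropping the failed users.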
| /// This resolves targets from chain (`MemberBuckets[account]` ∩ buckets with a | ||
| /// `StorageAgreements[bucket][provider]` entry) and never creates buckets or | ||
| /// agreements — if nothing matches, it errors out. | ||
| pub async fn upload(global: &GlobalArgs, args: &UploadArgs) -> Result<()> { |
Ideally, it should support the e2e flow: ensure provider registered -> create bucket -> upload data. You can see many examples at ./examples/papi/e2e
> Ideally, it should support e2e flow: ensure provider registered -> create bucket -> upload data. You can see many examples at ./examples/papi/e2e
Hard to say. e2e could be another command. I am sure that for other issues we will want to focus on optimizing particular parts, such as fast upload or fast download, so we would want to stress just those parts rather than the whole flow.
Hi @bkontur and @danielbui12,
I hope I have understood this correctly.
The upload function here checks whether the requester has any agreement with the targeted provider. If one is found, it uploads; if none is found, it bails out of the test because no agreement exists. So for now the agreement is assumed to exist as pre-setup.
Do you want this pre-setup stage to be automated as well?
Sorry for the late reply. I am currently in my master's exam period, so I might be slow to respond and to move this forward, but I am keen on completing this PR.
| # Seed: //Bob negotiates terms and opens a bucket + primary agreement | ||
| # against the //Alice provider, giving the CLI something to upload to. | ||
| - name: Seed bucket + agreement (//Bob → //Alice provider) | ||
| run: | | ||
| cargo run --release -p storage-client --example complete_workflow \ | ||
| ws://127.0.0.1:2222 http://127.0.0.1:3333 //Bob |
It is better to implement the logic directly in the crate/module. These examples may change over time, and duplicating logic between the examples and the crate/module can increase CI execution time and maintenance overhead.
#220 (comment)
@bkontur and @danielbui12
This is the pre-setup stage I just mentioned. So should this pre-setup move into the Upload function as the default?
| - name: Register provider on-chain | ||
| run: | | ||
| echo "//Alice" > /tmp/alice-key && chmod 600 /tmp/alice-key | ||
| cargo run --release -p storage-client --example register_provider \ | ||
| ws://127.0.0.1:2222 http://127.0.0.1:3333 /ip4/127.0.0.1/tcp/3333 /tmp/alice-key |
Ref: https://github.com/paritytech/web3-storage/pull/220/changes#r3471870419
Recently I dropped all provider registration jobs because the new chain_state_coordinator module will keep the provider node up to date with the runtime (#207).
…/darwinsubramaniam/web3-storage into feat/storage-cli-stress-test-175

Summary
Adds `utils/storage-cli`, a clap-based operator CLI built on the `storage-client` SDK, starting with a `stress-test upload` subcommand.
Changes
- New workspace member `utils/storage-cli`; promotes `clap` to a workspace dependency (version only, features set per-crate).
- `stress-test upload` resolves target buckets from chain (`MemberBuckets[account]` intersected with buckets that have a `StorageAgreements[bucket][provider]` entry) and uploads generated data over the provider HTTP API. It never creates buckets or agreements, and errors clearly when no matching buckets exist.
- Reuses `AdminClient` (chain reads) and `StorageUserClient` (HTTP upload); no duplicated chain or HTTP logic.
- `utils/storage-cli/README.md` documenting usage.
Related to #175
Notes for reviewers
- `utils/storage-cli/README.md` for invocation examples.