
Commit f238184

feat(file): add Compress and Decompress operations to the File block (#5100)
* feat(file): add Compress operation to bundle files into a .zip archive

* feat(file): add Decompress operation to extract .zip archives

  Adds the inbound half of the archive pair: extracts a .zip back into the workspace with zip-slip path sanitization, symlink skipping, and entry/size caps to bound zip-bomb expansion. Extracted files are returned in the files output, ready to chain downstream.

* fix(file): align archive ops with v5 output surface and zip mime

  - Drop the single 'file' output reintroduced for compress/decompress; v5 intentionally exposes only 'files' (plus id/name/size/url scalars), so compress/decompress reuse the existing surface with no new block output
  - Add zip/gz to EXTENSION_TO_MIME (previously only in the reverse map), so archive extensions resolve to a real mime instead of octet-stream
  - Update File v5 block test for the two new operations

* fix(file): harden compress naming per review

  - Flatten zip entry names to a safe basename so untrusted fileInput names with .. or / cannot produce zip-slip entry paths (cursor)
  - Treat archiveName as a flat name landing at the workspace root instead of passing it through splitWorkspaceFilePath, which silently created folders for names with separators (greptile)
  - Add the upfront empty-input guard before any DB calls, matching the read and content operations (greptile)

* fix(file): make decompress extraction atomic and bound per-entry size

  - Read and validate every entry before writing any file, so hitting a size cap no longer leaves partially-extracted files in the workspace (cursor)
  - Enforce the per-entry cap on the materialized buffer in addition to the declared size, covering entries that omit an uncompressed size (cursor)
  - Pre-check declared sizes up front to reject standard zip bombs before materializing, and return 422 when no files could be extracted (cursor)

* fix(file): exclude skipped entries from caps and reject multi-archive decompress

  - Resolve safe (sanitized) zip entries up front so unsafe/skipped entries no longer count toward the per-entry and total uncompressed-size caps (cursor)
  - Reject decompress input that resolves to more than one archive with a clear error instead of silently extracting only the first (cursor)

* fix(file): enforce single-archive decompress at the API boundary

  The block already rejects multiple archives, but the manage route is the real boundary (callable directly and by the LLM tool) and still took the first of multiple resolved inputs. Add the empty-input and >1-archive guards in the route so extra archives are rejected with a clear error rather than silently ignored (cursor).

* docs(file): correct compress description and stale file-output references

  - Drop the misleading 'under provider upload limits' claim from the compress tool description (models cannot read zip archives)
  - Fix bestPractices to reference the 'files' output, not a non-existent 'file'
  - Remove the stale 'file' property from the compress test fixture so it matches the real API response (greptile)
1 parent c864a92 commit f238184

9 files changed

Lines changed: 809 additions & 4 deletions

File tree

apps/sim/app/api/tools/file/manage/route.ts

Lines changed: 386 additions & 0 deletions
Large diffs are not rendered by default.
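
Since the route diff is not rendered here, the following is only a rough sketch of the decompress safeguards the commit message describes: flattening untrusted entry names, skipping symlinks and directories, excluding skipped entries from the caps, pre-checking declared sizes, and re-checking the materialized buffers before anything is written. The `ZipEntryLike` shape, the `ExtractionCaps` values, and the names `toSafeBasename` and `planExtraction` are assumptions for illustration, not the code in `route.ts`; the real decompress sanitization may also preserve folder structure rather than flatten it.

```typescript
// Hypothetical shape of a parsed zip entry; real zip libraries expose different APIs.
interface ZipEntryLike {
  name: string // entry path exactly as stored in the archive (untrusted)
  isDirectory: boolean
  isSymlink: boolean
  declaredSize?: number // uncompressed size from the zip header, may be absent
  read: () => Uint8Array // materializes the entry contents
}

// Hypothetical caps; the actual limits used by the route are not shown in this diff.
interface ExtractionCaps {
  maxEntryBytes: number
  maxTotalBytes: number
}

// Reduce an untrusted entry path to a flat basename, or null when nothing safe remains.
function toSafeBasename(entryName: string): string | null {
  const segments = entryName
    .replace(/\\/g, '/')
    .split('/')
    .filter((segment) => segment.length > 0)
  const base = segments.pop() ?? ''
  if (base === '' || base === '.' || base === '..' || base.includes('\0')) return null
  return base
}

// Validate every entry before returning; the caller writes files only after this succeeds.
function planExtraction(
  entries: ZipEntryLike[],
  caps: ExtractionCaps
): { name: string; data: Uint8Array }[] {
  // Resolve safe entries first so skipped ones never count toward the caps.
  const safe: { entry: ZipEntryLike; name: string }[] = []
  for (const entry of entries) {
    if (entry.isDirectory || entry.isSymlink) continue
    const name = toSafeBasename(entry.name)
    if (name !== null) safe.push({ entry, name })
  }

  // Pass 1: reject on declared sizes before materializing anything.
  let declaredTotal = 0
  for (const { entry } of safe) {
    const declared = entry.declaredSize ?? 0
    declaredTotal += declared
    if (declared > caps.maxEntryBytes || declaredTotal > caps.maxTotalBytes) {
      throw new Error('Archive exceeds the uncompressed size limits')
    }
  }

  // Pass 2: materialize and re-check, covering entries with missing or wrong declared sizes.
  const planned: { name: string; data: Uint8Array }[] = []
  let actualTotal = 0
  for (const { entry, name } of safe) {
    const data = entry.read()
    actualTotal += data.byteLength
    if (data.byteLength > caps.maxEntryBytes || actualTotal > caps.maxTotalBytes) {
      throw new Error('Archive exceeds the uncompressed size limits')
    }
    planned.push({ name, data })
  }

  if (planned.length === 0) throw new Error('No files could be extracted')
  return planned
}
```

Because nothing is written until `planExtraction` returns, a cap violation in the sketch leaves no partial output, which is the atomicity property the commit message calls out.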

apps/sim/blocks/blocks.test.ts

Lines changed: 4 additions & 0 deletions
@@ -172,7 +172,11 @@ describe.concurrent('Blocks Module', () => {
       'file_fetch',
       'file_write',
       'file_append',
+      'file_compress',
+      'file_decompress',
     ])
+    expect(block?.tools.config?.tool({ operation: 'file_compress' })).toBe('file_compress')
+    expect(block?.tools.config?.tool({ operation: 'file_decompress' })).toBe('file_decompress')
     expect(block?.subBlocks.find((subBlock) => subBlock.id === 'readFile')?.multiple).toBe(true)
     expect(block?.tools.config?.tool({ operation: 'file_read' })).toBe('file_read')
     expect(block?.tools.config?.tool({ operation: 'file_get_content' })).toBe('file_get_content')

apps/sim/blocks/blocks/file.ts

Lines changed: 140 additions & 4 deletions
@@ -822,9 +822,9 @@ export const FileV5Block: BlockConfig<FileParserV3Output> = {
   ...FileV4Block,
   type: 'file_v5',
   name: 'File',
-  description: 'Read, get content, fetch, write, and append files',
+  description: 'Read, get content, fetch, write, append, compress, and decompress files',
   longDescription:
-    'Read workspace file objects, extract the text content of files, fetch and parse files from URLs with optional headers, write new workspace files, or append content to existing files.',
+    'Read workspace file objects, extract the text content of files, fetch and parse files from URLs with optional headers, write new workspace files, append content to existing files, compress files into a .zip archive, or extract a .zip archive into the workspace.',
   hideFromToolbar: false,
   bestPractices: `
   - Read returns workspace file objects in the "files" output and does NOT include their text. Use it to pick files or pass file references downstream (e.g. as attachments).
@@ -833,6 +833,8 @@ export const FileV5Block: BlockConfig<FileParserV3Output> = {
   - Get Content's "contents" can be large; it is persisted through the execution large-value system automatically, so prefer it over inlining file text any other way.
   - Use Fetch for external file URLs. Add headers for authenticated downloads, for example Slack private file URLs require an Authorization Bearer token.
   - Use Write to create a new workspace file and Append to add content to an existing one.
+  - Use Compress to bundle one or more files into a single .zip archive stored in the workspace. The new archive is returned in the "files" output.
+  - Use Decompress to extract a .zip archive back into the workspace; the extracted files are returned in the "files" output, ready to chain into Get Content or downstream blocks.
   `,
   subBlocks: [
     {
@@ -845,6 +847,8 @@ export const FileV5Block: BlockConfig<FileParserV3Output> = {
         { label: 'Fetch', id: 'file_fetch' },
         { label: 'Write', id: 'file_write' },
         { label: 'Append', id: 'file_append' },
+        { label: 'Compress', id: 'file_compress' },
+        { label: 'Decompress', id: 'file_decompress' },
       ],
       value: () => 'file_read',
     },
@@ -962,9 +966,67 @@ export const FileV5Block: BlockConfig<FileParserV3Output> = {
       condition: { field: 'operation', value: 'file_append' },
       required: { field: 'operation', value: 'file_append' },
     },
+    {
+      id: 'compressFile',
+      title: 'Files',
+      type: 'file-upload' as SubBlockType,
+      canonicalParamId: 'compressInput',
+      acceptedTypes: '*',
+      placeholder: 'Select workspace files',
+      multiple: true,
+      mode: 'basic',
+      condition: { field: 'operation', value: 'file_compress' },
+      required: { field: 'operation', value: 'file_compress' },
+    },
+    {
+      id: 'compressFileId',
+      title: 'File ID',
+      type: 'short-input' as SubBlockType,
+      canonicalParamId: 'compressInput',
+      placeholder: 'Workspace file ID or JSON array of IDs',
+      mode: 'advanced',
+      condition: { field: 'operation', value: 'file_compress' },
+      required: { field: 'operation', value: 'file_compress' },
+    },
+    {
+      id: 'archiveName',
+      title: 'Archive Name',
+      type: 'short-input' as SubBlockType,
+      placeholder: 'archive.zip (auto-named from source if omitted)',
+      condition: { field: 'operation', value: 'file_compress' },
+    },
+    {
+      id: 'decompressFile',
+      title: 'Archive',
+      type: 'file-upload' as SubBlockType,
+      canonicalParamId: 'decompressInput',
+      acceptedTypes: '.zip',
+      placeholder: 'Select a .zip archive',
+      mode: 'basic',
+      condition: { field: 'operation', value: 'file_decompress' },
+      required: { field: 'operation', value: 'file_decompress' },
+    },
+    {
+      id: 'decompressFileId',
+      title: 'File ID',
+      type: 'short-input' as SubBlockType,
+      canonicalParamId: 'decompressInput',
+      placeholder: 'Workspace file ID of the .zip archive',
+      mode: 'advanced',
+      condition: { field: 'operation', value: 'file_decompress' },
+      required: { field: 'operation', value: 'file_decompress' },
+    },
   ],
   tools: {
-    access: ['file_read', 'file_get_content', 'file_fetch', 'file_write', 'file_append'],
+    access: [
+      'file_read',
+      'file_get_content',
+      'file_fetch',
+      'file_write',
+      'file_append',
+      'file_compress',
+      'file_decompress',
+    ],
     config: {
       tool: (params) => params.operation || 'file_read',
       params: (params) => {
@@ -1005,6 +1067,70 @@ export const FileV5Block: BlockConfig<FileParserV3Output> = {
           }
         }
 
+        if (operation === 'file_compress') {
+          const compressInput = params.compressInput
+          if (!compressInput) {
+            throw new Error('File is required for compress')
+          }
+
+          const archiveName =
+            typeof params.archiveName === 'string' && params.archiveName.trim()
+              ? params.archiveName.trim()
+              : undefined
+
+          const fileIds = parseReadFileIds(compressInput)
+          if (fileIds) {
+            return {
+              fileId: fileIds,
+              archiveName,
+              workspaceId: params._context?.workspaceId,
+            }
+          }
+
+          const normalized = normalizeFileInput(compressInput)
+          if (!normalized || normalized.length === 0) {
+            throw new Error('File is required for compress')
+          }
+
+          return {
+            fileInput: normalized,
+            archiveName,
+            workspaceId: params._context?.workspaceId,
+          }
+        }
+
+        if (operation === 'file_decompress') {
+          const decompressInput = params.decompressInput
+          if (!decompressInput) {
+            throw new Error('File is required for decompress')
+          }
+
+          const fileIds = parseReadFileIds(decompressInput)
+          if (fileIds) {
+            const ids = Array.isArray(fileIds) ? fileIds : [fileIds]
+            if (ids.length > 1) {
+              throw new Error('Decompress accepts a single .zip archive at a time')
+            }
+            return {
+              fileId: ids[0],
+              workspaceId: params._context?.workspaceId,
+            }
+          }
+
+          const normalized = normalizeFileInput(decompressInput)
+          if (!normalized || normalized.length === 0) {
+            throw new Error('File is required for decompress')
+          }
+          if (normalized.length > 1) {
+            throw new Error('Decompress accepts a single .zip archive at a time')
+          }
+
+          return {
+            fileInput: normalized[0],
+            workspaceId: params._context?.workspaceId,
+          }
+        }
+
         if (operation === 'file_fetch') {
           const fileUrl = resolveHttpFileUrl(params.fileUrl)
 
@@ -1089,11 +1215,21 @@ export const FileV5Block: BlockConfig<FileParserV3Output> = {
     contentType: { type: 'string', description: 'MIME content type for write' },
     appendFileInput: { type: 'json', description: 'File to append to' },
     appendContent: { type: 'string', description: 'Content to append to file' },
+    compressInput: {
+      type: 'json',
+      description: 'Selected workspace files or canonical file IDs to compress',
+    },
+    archiveName: { type: 'string', description: 'Name for the compressed .zip archive' },
+    decompressInput: {
+      type: 'json',
+      description: 'Selected .zip archive or canonical file ID to extract',
+    },
   },
   outputs: {
     files: {
       type: 'file[]',
-      description: 'Workspace file objects (read) or fetched file objects (fetch)',
+      description:
+        'Workspace file objects (read), fetched file objects (fetch), the compressed archive (compress), or extracted files (decompress)',
     },
     contents: {
       type: 'array',
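
The decompress branch of `params` in this file boils down to "normalize the input, then require exactly one archive". A minimal standalone sketch of that guard is shown here. It does not use the repository's `parseReadFileIds` or `normalizeFileInput` helpers, whose behavior is not visible in this diff; `toFileRefs` and `resolveSingleArchive` are hypothetical stand-ins, and only the two error messages are taken from the diff.

```typescript
// Hypothetical minimal file reference; the real workspace file object has more fields.
type FileRef = { id: string; name?: string }

// Hypothetical stand-in for the block's input normalization: accepts a single ID,
// a single file object, or an array mixing both, and drops anything unrecognized.
function toFileRefs(input: unknown): FileRef[] {
  const items: unknown[] = Array.isArray(input) ? input : [input]
  const refs: FileRef[] = []
  for (const item of items) {
    if (typeof item === 'string') {
      if (item.trim()) refs.push({ id: item.trim() })
    } else if (typeof item === 'object' && item !== null) {
      const candidate = item as FileRef
      if (typeof candidate.id === 'string') refs.push(candidate)
    }
  }
  return refs
}

// Require exactly one archive, mirroring the error messages used in the diff.
function resolveSingleArchive(input: unknown): FileRef {
  const refs = toFileRefs(input)
  if (refs.length === 0) {
    throw new Error('File is required for decompress')
  }
  if (refs.length > 1) {
    throw new Error('Decompress accepts a single .zip archive at a time')
  }
  return refs[0]
}
```

Rejecting the second archive explicitly, rather than taking `refs[0]` and moving on, is the design choice the commit message motivates: silently extracting only the first archive hides a user error.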

apps/sim/lib/api/contracts/tools/file.ts

Lines changed: 25 additions & 0 deletions
@@ -64,13 +64,38 @@ export const fileManageContentBodySchema = z
     message: 'Either fileId or fileInput is required for content operation',
   })
 
+export const fileManageCompressBodySchema = z
+  .object({
+    operation: z.literal('compress'),
+    workspaceId: z.string().min(1).optional(),
+    fileId: z.union([z.string().min(1), z.array(z.string().min(1)).min(1)]).optional(),
+    fileInput: z.unknown().optional(),
+    archiveName: z.string().min(1).max(255).optional(),
+  })
+  .refine((data) => data.fileId !== undefined || data.fileInput !== undefined, {
+    message: 'Either fileId or fileInput is required for compress operation',
+  })
+
+export const fileManageDecompressBodySchema = z
+  .object({
+    operation: z.literal('decompress'),
+    workspaceId: z.string().min(1).optional(),
+    fileId: z.string().min(1).optional(),
+    fileInput: z.unknown().optional(),
+  })
+  .refine((data) => data.fileId !== undefined || data.fileInput !== undefined, {
+    message: 'Either fileId or fileInput is required for decompress operation',
+  })
+
 export const fileManageBodySchema = z.union([
   fileManageWriteBodySchema,
   fileManageAppendBodySchema,
   fileManageGetBodySchema,
   fileManageMoveBodySchema,
   fileManageReadBodySchema,
   fileManageContentBodySchema,
+  fileManageCompressBodySchema,
+  fileManageDecompressBodySchema,
 ])
 
 export const fileManageContract = defineRouteContract({
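
To make the two new request shapes concrete, here is a dependency-free approximation of what the compress and decompress schemas accept. It is a hand-written sketch, not zod and not the contract code: the interfaces and `validateArchiveBody` are hypothetical, it covers only the `fileId`/`fileInput` refinement and the `archiveName` length bound, and the `archiveName` error text is invented for illustration.

```typescript
// Hypothetical TypeScript views of the two request bodies described by the schemas.
interface CompressBody {
  operation: 'compress'
  workspaceId?: string
  fileId?: string | string[] // compress accepts one ID or a non-empty array of IDs
  fileInput?: unknown
  archiveName?: string // 1 to 255 characters when present
}

interface DecompressBody {
  operation: 'decompress'
  workspaceId?: string
  fileId?: string // decompress accepts a single ID only
  fileInput?: unknown
}

type ArchiveBody = CompressBody | DecompressBody

// Returns an error message, or null when the body passes these (partial) checks.
function validateArchiveBody(body: ArchiveBody): string | null {
  // Mirrors the .refine() rule: at least one of fileId / fileInput must be provided.
  if (body.fileId === undefined && body.fileInput === undefined) {
    return `Either fileId or fileInput is required for ${body.operation} operation`
  }
  // Mirrors z.string().min(1).max(255) on archiveName (message text is an assumption).
  if (body.operation === 'compress' && body.archiveName !== undefined) {
    if (body.archiveName.length < 1 || body.archiveName.length > 255) {
      return 'archiveName must be between 1 and 255 characters'
    }
  }
  return null
}
```

Note the asymmetry between the two shapes: only compress allows an array of IDs, which is consistent with decompress being restricted to a single archive.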

apps/sim/lib/uploads/utils/file-utils.ts

Lines changed: 4 additions & 0 deletions
@@ -241,6 +241,10 @@ const EXTENSION_TO_MIME: Record<string, string> = {
   yml: 'application/x-yaml',
   rtf: 'application/rtf',
 
+  // Archives
+  zip: 'application/zip',
+  gz: 'application/gzip',
+
   // Code / plain-text source
   py: 'text/x-python',
   js: 'text/javascript',
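
The effect of the two new map entries can be illustrated with a small lookup. This is a sketch under assumptions: `SAMPLE_EXTENSION_TO_MIME` copies only a few entries visible in this hunk, and `mimeFromFilename` is a hypothetical helper, not the lookup function in `file-utils.ts`. The `application/octet-stream` fallback follows the commit message's description of what archive extensions resolved to before this change.

```typescript
// A small excerpt of the map, copied from the entries visible in this hunk.
const SAMPLE_EXTENSION_TO_MIME: Record<string, string> = {
  yml: 'application/x-yaml',
  rtf: 'application/rtf',
  zip: 'application/zip',
  gz: 'application/gzip',
}

// Hypothetical lookup: take the text after the last dot, lowercase it,
// and fall back to a generic binary type for unknown or missing extensions.
function mimeFromFilename(filename: string): string {
  const dot = filename.lastIndexOf('.')
  const extension = dot >= 0 ? filename.slice(dot + 1).toLowerCase() : ''
  return SAMPLE_EXTENSION_TO_MIME[extension] ?? 'application/octet-stream'
}
```

With the archive entries present, `mimeFromFilename('archive.ZIP')` resolves to `application/zip`; an extension missing from the map still falls back to `application/octet-stream`.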
Lines changed: 120 additions & 0 deletions
@@ -0,0 +1,120 @@
+/**
+ * @vitest-environment node
+ */
+import { describe, expect, it } from 'vitest'
+import { fileCompressTool, fileDecompressTool } from '@/tools/file/compress'
+
+describe('fileCompressTool', () => {
+  it('builds a compress request body from file IDs and archive name', () => {
+    const body = fileCompressTool.request.body?.({
+      fileId: ['wf_a', 'wf_b'],
+      archiveName: 'documents.zip',
+      _context: { workspaceId: 'ws_1' },
+    } as Parameters<NonNullable<typeof fileCompressTool.request.body>>[0])
+
+    expect(body).toMatchObject({
+      operation: 'compress',
+      fileId: ['wf_a', 'wf_b'],
+      archiveName: 'documents.zip',
+      workspaceId: 'ws_1',
+    })
+  })
+
+  it('forwards a selected file object when no IDs are provided', () => {
+    const fileInput = { id: 'wf_c', name: 'report.pdf' }
+    const body = fileCompressTool.request.body?.({
+      fileInput,
+      workspaceId: 'ws_2',
+    } as Parameters<NonNullable<typeof fileCompressTool.request.body>>[0])
+
+    expect(body).toMatchObject({
+      operation: 'compress',
+      fileInput,
+      workspaceId: 'ws_2',
+    })
+  })
+
+  it('returns the compressed archive on success', async () => {
+    const archive = {
+      id: 'wf_zip',
+      name: 'archive.zip',
+      size: 1024,
+      url: 'https://example.com/archive.zip',
+      type: 'application/zip',
+      key: 'workspace/ws_1/archive.zip',
+    }
+
+    const result = await fileCompressTool.transformResponse?.(
+      Response.json({
+        success: true,
+        data: {
+          id: archive.id,
+          name: archive.name,
+          size: archive.size,
+          url: archive.url,
+          files: [archive],
+        },
+      })
+    )
+
+    expect(result).toMatchObject({
+      success: true,
+      output: { id: 'wf_zip', name: 'archive.zip', size: 1024, files: [archive] },
+    })
+  })
+
+  it('propagates route failures as tool failures', async () => {
+    const result = await fileCompressTool.transformResponse?.(
+      Response.json({ success: false, error: 'Combined input is too large to compress.' })
+    )
+
+    expect(result).toMatchObject({
+      success: false,
+      error: 'Combined input is too large to compress.',
+      output: {},
+    })
+  })
+})
+
+describe('fileDecompressTool', () => {
+  it('builds a decompress request body from a file ID', () => {
+    const body = fileDecompressTool.request.body?.({
+      fileId: 'wf_zip',
+      _context: { workspaceId: 'ws_1' },
+    } as Parameters<NonNullable<typeof fileDecompressTool.request.body>>[0])
+
+    expect(body).toMatchObject({
+      operation: 'decompress',
+      fileId: 'wf_zip',
+      workspaceId: 'ws_1',
+    })
+  })
+
+  it('returns the extracted files on success', async () => {
+    const extracted = [
+      { id: 'wf_a', name: 'a.txt', url: 'https://example.com/a.txt', key: 'k/a.txt' },
+      { id: 'wf_b', name: 'b.txt', url: 'https://example.com/b.txt', key: 'k/b.txt' },
+    ]
+
+    const result = await fileDecompressTool.transformResponse?.(
+      Response.json({ success: true, data: { files: extracted } })
+    )
+
+    expect(result).toMatchObject({
+      success: true,
+      output: { files: extracted },
+    })
+  })
+
+  it('propagates route failures as tool failures', async () => {
+    const result = await fileDecompressTool.transformResponse?.(
+      Response.json({ success: false, error: '"data.txt" is not a valid .zip archive' })
+    )
+
+    expect(result).toMatchObject({
+      success: false,
+      error: '"data.txt" is not a valid .zip archive',
+      output: {},
+    })
+  })
+})
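
The tool implementation under test is not part of the rendered diff, so the mapping these tests pin down can only be guessed at. One plausible reading, sketched below, is that a successful route envelope passes its `data` through as the tool `output`, while a failed one surfaces `error` with an empty `output`. The `RouteEnvelope` and `ToolResult` types, the `toToolResult` name, and the fallback error text are all assumptions; the sketch also works on an already-parsed JSON payload rather than a `Response`.

```typescript
// Hypothetical shape of the JSON body returned by the manage route.
interface RouteEnvelope {
  success: boolean
  data?: Record<string, unknown>
  error?: string
}

// Hypothetical shape of what a tool's transformResponse resolves to.
interface ToolResult {
  success: boolean
  output: Record<string, unknown>
  error?: string
}

// Map a parsed route envelope to a tool result, consistent with the assertions in the tests.
function toToolResult(envelope: RouteEnvelope): ToolResult {
  if (!envelope.success) {
    // Failure: surface the route's error and return an empty output object.
    return {
      success: false,
      error: envelope.error ?? 'File operation failed', // fallback text is an assumption
      output: {},
    }
  }
  // Success: the route's data (id/name/size/url scalars plus files) becomes the output.
  return { success: true, output: envelope.data ?? {} }
}
```

Since the tests use `toMatchObject`, they constrain only the listed keys; the real implementation may add or reshape fields that this sketch does not model.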
