Refactor Langfuse dataset upload #32

Merged
fcogidi merged 2 commits into main from fco/refactor_langfuse_upload on Feb 4, 2026

@fcogidi fcogidi (Collaborator) commented Feb 4, 2026

Summary

Refactors the Langfuse dataset upload to support both JSON and JSONL inputs with format auto-detection, adds reusable Rich progress helpers, and adds focused unit tests for upload parsing and validation behaviour.

Clickup Ticket(s): N/A

Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)
  • ✨ New feature (non-breaking change that adds functionality)
  • 💥 Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • 📝 Documentation update
  • 🔧 Refactoring (no functional changes)
  • ⚡ Performance improvement
  • 🧪 Test improvements
  • 🔒 Security fix

Changes Made

  • Added reusable progress helpers in aieng-eval-agents/aieng/agent_evals/progress.py (create_progress and track_with_progress) and exported them in aieng-eval-agents/aieng/agent_evals/__init__.py.
  • Refactored upload_dataset_to_langfuse in aieng-eval-agents/aieng/agent_evals/langfuse.py to:
    • auto-detect JSON vs JSONL,
    • parse JSON arrays and line-by-line JSONL,
    • validate required keys (input, expected_output),
    • surface line-aware JSONL parse errors,
    • normalize metadata (id fallback handling),
    • show upload progress during dataset item creation.
  • Added unit tests in aieng-eval-agents/tests/aieng/agent_evals/test_langfuse.py covering JSON upload, JSONL upload, malformed JSONL handling, missing required fields, and progress helper smoke checks.
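
As a rough illustration of the parsing steps listed above, here is a minimal, stdlib-only sketch. It is not the code in `langfuse.py`: the names `detect_format` and `parse_records`, the "leading `[` means JSON array" heuristic, and the error message wording are all assumptions made for this example. It also skips metadata normalization and the actual upload.

```python
import json
from typing import Any

# Required keys named in the PR description.
REQUIRED_KEYS = ("input", "expected_output")


def detect_format(text: str) -> str:
    """Guess the format from content (assumed heuristic): a leading '[' means a JSON array."""
    return "json" if text.lstrip().startswith("[") else "jsonl"


def parse_records(text: str) -> list[dict[str, Any]]:
    """Parse a JSON array or line-delimited JSON and validate each record."""
    if detect_format(text) == "json":
        records = json.loads(text)
        if not isinstance(records, list):
            raise ValueError("JSON input must be an array of objects")
        numbered = list(enumerate(records, start=1))
        label = "item"
    else:
        numbered = []
        for line_no, line in enumerate(text.splitlines(), start=1):
            if not line.strip():
                continue  # blank lines are skipped but still counted
            try:
                numbered.append((line_no, json.loads(line)))
            except json.JSONDecodeError as exc:
                # Report the 1-based line number so the bad row is easy to find.
                raise ValueError(f"Invalid JSON on line {line_no}: {exc.msg}") from exc
        label = "line"

    for pos, record in numbered:
        if not isinstance(record, dict):
            raise ValueError(f"{label} {pos}: expected a JSON object")
        missing = [key for key in REQUIRED_KEYS if key not in record]
        if missing:
            raise ValueError(f"{label} {pos}: missing required keys {missing}")
    return [record for _, record in numbered]
```

With this sketch, `parse_records('{"input": 1}\n')` raises a `ValueError` that names line 1 and the missing `expected_output` key. A pretty-printed single JSON object spanning several lines would be misread as JSONL by this heuristic, so a real implementation may want a stricter check.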

Testing

  • Tests pass locally (uv run pytest tests/)
  • Type checking passes (uv run mypy <src_dir>)
  • Linting passes (uv run ruff check src_dir/)
  • Manual testing performed (describe below)

Manual testing details:
N/A

Screenshots/Recordings

N/A

Related Issues

N/A

Deployment Notes

Checklist

  • Code follows the project's style guidelines
  • Self-review of code completed
  • Documentation updated (if applicable)
  • No sensitive information (API keys, credentials) exposed

- Introduced `create_progress` and `track_with_progress` for consistent progress tracking.
- Updated `upload_dataset_to_langfuse` to support JSON and JSONL formats with improved error handling.
- Added tests for dataset upload functionality, ensuring proper metadata handling and error reporting.
@fcogidi fcogidi requested review from amrit110, Copilot and lotif February 4, 2026 15:52
@fcogidi fcogidi self-assigned this Feb 4, 2026
@fcogidi fcogidi added the enhancement and refactor labels Feb 4, 2026
@fcogidi fcogidi changed the title from "Refactor Langfuse dataset update" to "Refactor Langfuse dataset upload" Feb 4, 2026
Copilot AI left a comment

Pull request overview

This PR refactors the Langfuse dataset upload functionality to support both JSON and JSONL formats with automatic format detection, adds reusable Rich progress utilities for improved user experience, and includes comprehensive unit tests for the new parsing and validation logic.

Changes:

  • Added reusable progress bar utilities (create_progress and track_with_progress) for consistent progress visualization
  • Enhanced upload_dataset_to_langfuse to auto-detect format, validate records, and provide better error messages with line numbers
  • Added focused unit tests covering JSON/JSONL upload, malformed data handling, and progress helper functionality

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| aieng-eval-agents/aieng/agent_evals/progress.py | Introduces reusable Rich progress utilities with standardized column layout |
| aieng-eval-agents/aieng/agent_evals/langfuse.py | Refactors dataset upload with format detection, improved validation, and progress tracking |
| aieng-eval-agents/aieng/agent_evals/__init__.py | Exports new progress utilities for public API |
| aieng-eval-agents/tests/aieng/agent_evals/test_langfuse.py | Adds comprehensive unit tests for upload parsing, validation, and progress helpers |

@amrit110 amrit110 (Member) left a comment

Just the one comment! Overall looks good.

@fcogidi fcogidi requested a review from amrit110 February 4, 2026 16:58
@fcogidi fcogidi merged commit 1d6f106 into main Feb 4, 2026
3 checks passed
@fcogidi fcogidi deleted the fco/refactor_langfuse_upload branch February 4, 2026 18:13
Labels

enhancement (New feature or request), refactor (Refactor or clean up code structure)

2 participants