@goloroden

Summary

This PR adds support for converting event streams to pandas DataFrames, enabling data analysis and exploration of events.

Key changes:

  • New eventsourcingdb.pandas module with events_to_dataframe() function
  • Converts AsyncGenerator[Event, None] to pd.DataFrame with all event fields
  • Pandas added as optional dependency (pip install eventsourcingdb[pandas])
  • Comprehensive test suite (8 tests) following TDD approach
  • README documentation with usage examples and flattening guide
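As a rough illustration of the conversion described above, here is a minimal sketch of how an async event stream could be collected into a DataFrame. It is not the code from this PR: `SketchEvent`, `SKETCH_COLUMNS`, and `sketch_events_to_dataframe` are hypothetical stand-ins, and the field names are assumptions covering only a subset of the real event fields.

```python
import asyncio
from collections.abc import AsyncGenerator
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any

import pandas as pd


@dataclass
class SketchEvent:
    # Hypothetical stand-in for the library's Event type (assumed field names).
    event_id: str
    time: datetime
    subject: str
    type: str
    data: dict[str, Any]


# Illustrative subset of columns; the PR describes 13 columns in total.
SKETCH_COLUMNS = ["event_id", "time", "subject", "type", "data"]


async def sketch_events_to_dataframe(
    events: AsyncGenerator[SketchEvent, None],
) -> pd.DataFrame:
    # Drain the async generator into one plain dict per event.
    rows = [
        {name: getattr(event, name) for name in SKETCH_COLUMNS}
        async for event in events
    ]
    # Passing columns explicitly keeps the column set stable,
    # even when the stream yields no events at all.
    return pd.DataFrame(rows, columns=SKETCH_COLUMNS)
```

The `data` field is copied as-is, so each cell in that column holds the original dict.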

Design decisions:

  • All event fields included as DataFrame columns (13 columns total)
  • data field remains as dict for flexibility
  • Optional flattening on-demand using pd.json_normalize()
  • Minimal-invasive add-on that doesn't affect core functionality
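The on-demand flattening mentioned above could look roughly like the following. The DataFrame here is built by hand with made-up sample data and an assumed `subject` column, rather than coming from `events_to_dataframe()`; only `pd.json_normalize()` is taken from the PR description.

```python
import pandas as pd

# Hand-built stand-in for a converted event stream (sample data is made up).
df = pd.DataFrame(
    {
        "subject": ["/books/1", "/books/2"],
        "data": [
            {"title": "Dune", "author": {"name": "Frank Herbert"}},
            {"title": "Neuromancer", "author": {"name": "William Gibson"}},
        ],
    }
)

# Expand the dicts in the data column; nested keys are joined with a dot.
flat = pd.json_normalize(df["data"].tolist()).add_prefix("data.")

# Replace the dict column with the flattened columns, aligned by row index.
result = df.drop(columns=["data"]).join(flat)
```

Prefixing the flattened columns with `data.` is one way to avoid clashes with event fields such as `type` or `subject` when the payload uses the same key names.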

Example Usage

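from eventsourcingdb import Client, ReadEventsOptions
from eventsourcingdb.pandas import events_to_dataframe

# Assumes `client` is an already configured Client instance and that
# this snippet runs inside an async function (note the `await` below).
events = client.read_events(
  subject = '/books',
  options = ReadEventsOptions(recursive = True)
)

# Drain the event stream into a DataFrame with one row per event.
df = await events_to_dataframe(events)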

Test Plan

  • ✅ Empty event stream returns empty DataFrame with correct columns
  • ✅ Single event conversion
  • ✅ Multiple events conversion
  • ✅ Column names validation
  • ✅ Data types validation (datetime for time field, dict for data)
  • ✅ Optional fields (trace_parent, trace_state, signature) handling
  • ✅ All event fields present in DataFrame
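The data type and optional field checks in the list above might be expressed along these lines. This is a sketch against a hand-built DataFrame, not the PR's test suite; the sample rows are invented and only the column names `time`, `data`, and `trace_parent` come from the description.

```python
from datetime import datetime, timezone

import pandas as pd

# Invented sample rows standing in for converted events.
rows = [
    {
        "time": datetime(2025, 11, 13, 0, 53, tzinfo=timezone.utc),
        "data": {"title": "Dune"},
        "trace_parent": None,
    },
    {
        "time": datetime(2025, 11, 13, 1, 8, tzinfo=timezone.utc),
        "data": {"title": "Emma"},
        "trace_parent": None,
    },
]
df = pd.DataFrame(rows)

# The time column should be a datetime dtype (timezone-aware counts too).
time_is_datetime = pd.api.types.is_datetime64_any_dtype(df["time"])

# Every cell of the data column should still be a plain dict.
data_is_dict = df["data"].map(lambda value: isinstance(value, dict)).all()

# An unset optional field shows up as a missing value in every row.
trace_parent_missing = df["trace_parent"].isna().all()
```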

CI/CD

Tests will run automatically via GitHub Actions. The PR includes:

  • pytest tests for all scenarios
  • ruff linting compliance
  • pyright type checking
  • bandit security scanning

🤖 Generated with Claude Code

Add support for converting event streams to pandas DataFrames for
data analysis and exploration. The events_to_dataframe() function
accepts an AsyncGenerator of events and returns a DataFrame with
all event fields as columns.

- Add eventsourcingdb.pandas module with events_to_dataframe() function
- Add comprehensive test suite following TDD approach
- Add pandas as optional dependency group
- Update README with usage examples and flattening guide

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
@goloroden goloroden requested a review from a team as a code owner November 13, 2025 00:53
goloroden and others added 7 commits November 13, 2025 01:55
When converting an empty event stream, the DataFrame now includes
all expected columns instead of being completely empty. This is
achieved by explicitly defining the columns parameter when creating
the DataFrame.

Use conditional logic to create DataFrame with columns parameter
only when the event list is empty, avoiding type checking issues
with pyright while maintaining correct behavior.

Add type: ignore annotation for the columns parameter when creating
an empty DataFrame, as pyright's pandas stubs don't recognize
list[str] as a valid type despite it working correctly at runtime.

Set DataFrame columns using pd.Index after creation instead of
passing columns parameter to constructor, resolving pyright type
checking errors while maintaining correct functionality.

Revert to using columns parameter with correct pyright ignore syntax
(reportArgumentType) instead of attempting to set columns property
after creation, which fails for empty DataFrames.
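The difference between the two approaches in these commits can be reproduced with plain pandas. The sketch below uses an illustrative three-column subset (the names are assumptions); the ignore comment shows pyright's rule-specific syntax and is only relevant when the type checker flags the argument.

```python
import pandas as pd

# Illustrative subset of the expected columns (assumed names).
COLUMNS = ["event_id", "time", "data"]

# Passing columns to the constructor yields an empty frame that still has them.
with_param = pd.DataFrame([], columns=COLUMNS)  # pyright: ignore[reportArgumentType]

# Assigning columns afterwards fails: the bare empty frame has zero columns,
# so the new index length does not match.
bare = pd.DataFrame([])
try:
    bare.columns = pd.Index(COLUMNS)
    assignment_failed = False
except ValueError:
    assignment_failed = True
```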

@goloroden goloroden merged commit 6158512 into main Nov 13, 2025
4 checks passed
@goloroden goloroden deleted the feat/pandas-dataframe-integration branch November 13, 2025 01:08