Add SleepWakeClassification task for DREAMT #892
diegofariasc wants to merge 28 commits into sunlabuiuc:master
…sic wearable features
…ke_classification_lightgbm.py
jhnwu3
left a comment
Let me know if you have any more questions. Nice work!
from pyhealth.tasks.sleep_wake_classification import SleepWakeClassification


class FakeEvent:
I've been thinking a lot about standardizing where test cases live and what they typically look like. I like how you reverse-engineered some of what goes on within PyHealth, but I think it would be much better if we followed some of the other examples. (We're still working out test case best practices as we learn what scales better and what doesn't.)
But, if possible, instead of constructing object-oriented test classes, can we just create fake tmp data and use the explicit datasets/data types in PyHealth?
See this example:
https://github.com/sunlabuiuc/PyHealth/blob/master/tests/core/test_chestxray14.py
@EricSchrock can probably give much better advice on this. But the tl;dr is that we want the testing environment to mimic the real working environment as closely as possible.
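A minimal sketch of the tmp-data approach suggested above, using only the standard library. The column names and the `make_fake_recording` helper are hypothetical stand-ins, not the actual DREAMT schema or a PyHealth API. In a real test, the temporary directory written here would presumably be handed to the PyHealth dataset class rather than read back directly.

```python
import csv
import tempfile
import unittest
from pathlib import Path

# Hypothetical column names for illustration; the real DREAMT files may differ.
FAKE_COLUMNS = ["TIMESTAMP", "ACC_X", "TEMP", "Sleep_Stage"]


def make_fake_recording(root: Path, record_id: str, n_rows: int = 8) -> Path:
    """Write a tiny synthetic recording under `root` and return its path."""
    path = root / f"{record_id}.csv"
    with path.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(FAKE_COLUMNS)
        for i in range(n_rows):
            # First half of the rows are wake, second half are sleep.
            stage = "W" if i < n_rows // 2 else "N2"
            writer.writerow([i * 0.5, 0.01 * i, 33.0, stage])
    return path


class TestWithTmpData(unittest.TestCase):
    def setUp(self):
        # Fresh temporary directory per test, removed again in tearDown.
        self.tmp = tempfile.TemporaryDirectory()
        self.path = make_fake_recording(Path(self.tmp.name), "S001")

    def tearDown(self):
        self.tmp.cleanup()

    def test_file_round_trip(self):
        with self.path.open(newline="") as f:
            rows = list(csv.DictReader(f))
        self.assertEqual(len(rows), 8)
        self.assertEqual(rows[0]["Sleep_Stage"], "W")
        self.assertEqual(rows[-1]["Sleep_Stage"], "N2")
```

The point of the pattern is that the test exercises real files on disk, so the same loading path runs in tests and in normal use.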
"""Binary sleep-wake classification task for DREAMT wearable recordings.
This task converts each DREAMT wearable recording into fixed-length epochs,
extracts physiological features from multiple sensor modalities,
Can you give a rough breakdown of what's happening here? It's a bit difficult to follow across something like 10 different private functions, haha. It'd be good to have it flow: starting from step 1 through step 9, this is what's going on, and this is the sequence of function calls.
Similarly, in the docstrings, can you add an example of how a user might use DREAMT here with the set_task() call?
It's also not clear exactly what we mean by features here. What shape is it? What do its dimensions mean?
I think a key good practice here is to explain what we're doing and what the expected outputs are.
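To make the shape question concrete, here is a rough sketch of one way epoch-level features could be laid out: one row per epoch, one column per named feature. The 4 Hz sampling rate, the 30-second epoch length, the feature names, and both helper functions are assumptions for illustration and are not taken from the PR's implementation.

```python
from statistics import mean, pstdev
from typing import List

# Hypothetical per-epoch feature set; column j of every row is FEATURE_NAMES[j].
FEATURE_NAMES = ["mean", "std", "min", "max"]


def split_into_epochs(
    signal: List[float], sampling_rate_hz: int, epoch_seconds: int
) -> List[List[float]]:
    """Cut a 1-D signal into fixed-length epochs, dropping any trailing partial epoch."""
    samples_per_epoch = sampling_rate_hz * epoch_seconds
    n_epochs = len(signal) // samples_per_epoch
    return [
        signal[i * samples_per_epoch:(i + 1) * samples_per_epoch]
        for i in range(n_epochs)
    ]


def extract_features(epoch: List[float]) -> List[float]:
    """Summarize one epoch as a flat vector aligned with FEATURE_NAMES."""
    return [mean(epoch), pstdev(epoch), min(epoch), max(epoch)]


# Synthetic signal: three full 30 s epochs at 4 Hz plus 7 leftover samples.
signal = [float(i % 10) for i in range(4 * 30 * 3 + 7)]
epochs = split_into_epochs(signal, sampling_rate_hz=4, epoch_seconds=30)
features = [extract_features(e) for e in epochs]

# Resulting layout: (n_epochs, n_features).
n_epochs, n_features = len(features), len(features[0])
```

Documenting the layout this way (dimension 0 is the epoch index, dimension 1 is a named feature) would answer both the shape and the meaning question in the docstring.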
X_test = imputer.transform(X_test)

# Train a LightGBM model on the current feature subset.
train_data = lgb.Dataset(X_train, label=y_train)
Just curious: is it possible to create a ContraWR example with a signals-based model here? The LightGBM example isn't bad, and I see the inputs are essentially formatted like a table (a table of signals), which is still pretty cool to see as an example.
But this is a signals dataset after all:
https://physionet.org/content/dreamt/2.1.0/
Contributor: Diego Farias Castro (diegof4@illinois.edu)
Type of contribution: task
Link to original paper: https://proceedings.mlr.press/v248/wang24a.html
High-level description:
This PR adds a new standalone task, SleepWakeClassification, on top of the existing DREAMTDataset. The task supports epoch-level sleep-vs-wake prediction from multimodal wrist-worn wearable signals. It turns each DREAMT record into fixed-length epochs, extracts features from accelerometer, temperature, blood volume pulse, and electrodermal activity signals, adds temporal context features, and assigns a binary sleep/wake label to each epoch.
Implementation summary:
SleepWakeClassification in pyhealth/tasks/sleep_wake_classification.py
pyhealth/tasks/__init__.py
docs/api/tasks/pyhealth.tasks.sleep_wake_classification.rst
docs/api/tasks.rst
examples/dreamt_sleep_wake_classification_lightgbm.py
tests/core/test_sleep_wake_classification.py
Reproducibility scope:
This PR focuses on the task side of the paper. It makes the sleep-wake prediction setting available inside PyHealth so the generated samples can be used in new experiments and ablation studies.
Task behavior:
… DREAMTDataset
… patient_id, record_id, epoch_index, features, and binary label
… 1; sleep stages (REM, N1, N2, N3) map to 0
… features, and EDA-based SCR features
… variance for each base feature
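The label mapping and the variance-based temporal context listed above can be sketched roughly as follows. Only the sleep-stage side of the mapping is taken from the description; the wake code `"W"`, the handling of unknown stages, the trailing window, and both function names are assumptions for illustration, not the PR's actual implementation.

```python
from statistics import pvariance
from typing import List, Optional

SLEEP_STAGES = {"REM", "N1", "N2", "N3"}  # map to 0, per the PR description
WAKE_STAGES = {"W"}  # assumed wake code; maps to 1


def stage_to_label(stage: str) -> Optional[int]:
    """Return 0 for sleep, 1 for wake, or None for a stage outside both sets."""
    if stage in SLEEP_STAGES:
        return 0
    if stage in WAKE_STAGES:
        return 1
    return None


def rolling_variance(values: List[float], window: int) -> List[float]:
    """Population variance over a trailing window of up to `window` epochs.

    Early epochs use however many values are available, so the output has
    the same length as the input.
    """
    out = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        out.append(pvariance(values[start:i + 1]))
    return out


labels = [stage_to_label(s) for s in ["W", "N1", "REM", "N3"]]
context = rolling_variance([1.0, 1.0, 3.0], window=2)
```

Whether the task uses a trailing or a centered window, and how it treats stages outside these two sets, is not stated in the description, so those choices would need to be checked against the task code.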
File guide:
pyhealth/tasks/sleep_wake_classification.py: task implementation
pyhealth/tasks/__init__.py: public task export
docs/api/tasks/pyhealth.tasks.sleep_wake_classification.rst: task docs
docs/api/tasks.rst: task index update
examples/dreamt_sleep_wake_classification_lightgbm.py: example and ablation workflow
tests/core/test_sleep_wake_classification.py: task unit tests