Merge pull request #37 from eXascaleInfolab/naterq_memory
naterq memory
qnater authored Nov 22, 2024
2 parents 3e1485b + 058881e commit 4ba842f
Showing 147 changed files with 1,415 additions and 151 deletions.
Binary file modified .coverage
Binary file not shown.
4 changes: 3 additions & 1 deletion .github/workflows/pytest_loading.yml
@@ -24,6 +24,7 @@ jobs:
sudo apt-get install libmlpack-dev
sudo apt-get install libopenblas-dev
sudo apt-get install python3-dev build-essential
pip install --upgrade google protobuf
pip install -r requirements.txt
pip install mypy
pip install pytest
@@ -32,4 +33,5 @@ jobs:
- name: Run pytest
run: |
python -m pytest ./tests/test_loading.py
python -m pytest ./tests/test_exception.py
python -m pytest ./tests/test_exception.py
python -m pytest ./tests/test_benchmarking.py
127 changes: 56 additions & 71 deletions .idea/workspace.xml

Large diffs are not rendered by default.

9 changes: 9 additions & 0 deletions LICENSE.txt
@@ -0,0 +1,9 @@
MIT License

Copyright (c) 2019-2024 UNIFR

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
8 changes: 4 additions & 4 deletions README.md
@@ -3,12 +3,12 @@

# Welcome to ImputeGAP

ImputeGAP is a comprehensive framework designed for time series imputation algorithms. It offers a streamlined interface that bridges algorithm evaluation and parameter tuning, utilizing datasets from diverse fields such as neuroscience, medicine, and energy. The framework includes advanced imputation algorithms from five different families, supports various patterns of missing data, and provides multiple evaluation metrics. Additionally, ImputeGAP enables AutoML optimization, feature extraction, and feature analysis with SHAP. The framework is built for easy integration of new algorithms, datasets, and evaluation metrics, enhancing its flexibility and adaptability.
ImputeGAP is a comprehensive framework designed for time series imputation algorithms. It offers a streamlined interface that bridges algorithm evaluation and parameter tuning, utilizing datasets from diverse fields such as neuroscience, medicine, and energy. The framework includes advanced imputation algorithms from five different families, supports various patterns of missing data, and provides multiple evaluation metrics. Additionally, ImputeGAP enables AutoML optimization, feature extraction, and feature analysis. The framework enables easy integration of new algorithms, datasets, and evaluation metrics.

![Python](https://img.shields.io/badge/Python-v3.12-blue)
![Release](https://img.shields.io/badge/Release-v0.2.2-brightgreen)
![License](https://img.shields.io/badge/License-GPLv3-blue?style=flat&logo=gnu)
![Coverage](https://img.shields.io/badge/Coverage-93%25-brightgreen)
![Coverage](https://img.shields.io/badge/Coverage-91%25-brightgreen)
![PyPI](https://img.shields.io/pypi/v/imputegap?label=PyPI&color=blue)
![Language](https://img.shields.io/badge/Language-English-blue)
![Platform](https://img.shields.io/badge/platform-Windows%20%7C%20Linux%20%7C%20MacOS-informational)
@@ -251,9 +251,9 @@ To add your own imputation algorithm in Python or C++, please refer to the detai

## References

Mourad Khayati, Quentin Nater, and Jacques Pasquier. <b>“ImputeVIS: An Interactive Evaluator to Benchmark Imputation Techniques for Time Series Data.”</b> Proceedings of the VLDB Endowment (PVLDB). Demo Track 17, no. 1 (2024): 4329–32.
Mourad Khayati, Quentin Nater, and Jacques Pasquier. “ImputeVIS: An Interactive Evaluator to Benchmark Imputation Techniques for Time Series Data.” Proceedings of the VLDB Endowment (PVLDB). Demo Track 17, no. 1 (2024): 4329–32.

Mourad Khayati, Alberto Lerner, Zakhar Tymchenko, and Philippe Cudre-Mauroux. <b>“Mind the Gap: An Experimental Evaluation of Imputation of Missing Values Techniques in Time Series.”</b> In Proceedings of the VLDB Endowment (PVLDB), Vol. 13, 2020.
Mourad Khayati, Alberto Lerner, Zakhar Tymchenko, and Philippe Cudre-Mauroux. “Mind the Gap: An Experimental Evaluation of Imputation of Missing Values Techniques in Time Series.” In Proceedings of the VLDB Endowment (PVLDB), Vol. 13, 2020.


---
Binary file added assets/TimeSeriesData_plot.jpg
Binary file added assets/test_plot.jpg
2 changes: 1 addition & 1 deletion build/lib/imputegap/__init__.py
@@ -1 +1 @@
__version__ = "0.2.2"
__version__ = "1.0.1"
4 changes: 2 additions & 2 deletions docs/generation/source/conf.py
@@ -33,8 +33,8 @@
html_css_files = ['custom.css']

# Set the version and release info
version = '0.2.2'
release = '0.2.2'
version = '1.0.1'
release = '1.0.1'


# You can also add links to edit the documentation on GitHub
2 changes: 1 addition & 1 deletion imputegap.egg-info/PKG-INFO
@@ -1,6 +1,6 @@
Metadata-Version: 2.1
Name: imputegap
Version: 0.2.2
Version: 1.0.1
Summary: A Library of Imputation Techniques for Time Series Data
Home-page: https://github.com/eXascaleInfolab/ImputeGAP
Author: Quentin Nater
2 changes: 1 addition & 1 deletion imputegap/__init__.py
@@ -1 +1 @@
__version__ = "0.2.2"
__version__ = "1.0.1"
Binary file modified imputegap/__pycache__/__init__.cpython-312.pyc
Binary file not shown.
Binary file modified imputegap/assets/shap/chlorine_cdrec_DTL_Beeswarm.png
Binary file modified imputegap/assets/shap/chlorine_cdrec_DTL_Waterfall.png
32 changes: 16 additions & 16 deletions imputegap/assets/shap/chlorine_cdrec_results.txt
@@ -1,22 +1,22 @@
Feature : 6 cdrec with a score of 58.98 Geometry Proportion of high incremental changes in the series MD_hrv_classic_pnn40
Feature : 5 cdrec with a score of 9.1 Correlation Time reversibility CO_trev_1_num
Feature : 2 cdrec with a score of 6.02 Correlation First 1/e crossing of the ACF CO_f1ecac
Feature : 1 cdrec with a score of 4.52 Geometry 10-bin histogram mode DN_HistogramMode_10
Feature : 10 cdrec with a score of 4.42 Geometry Goodness of exponential fit to embedding distance distribution CO_Embed2_Dist_tau_d_expfit_meandiff
Feature : 15 cdrec with a score of 4.1 Transformation Power in the lowest 20% of frequencies SP_Summaries_welch_rect_area_5_1
Feature : 21 cdrec with a score of 3.56 Trend Error of 3-point rolling mean forecast FC_LocalSimple_mean3_stderr
Feature : 12 cdrec with a score of 3.44 Correlation Change in autocorrelation timescale after incremental differencing FC_LocalSimple_mean1_tauresrat
Feature : 0 cdrec with a score of 2.64 Geometry 5-bin histogram mode DN_HistogramMode_5
Feature : 17 cdrec with a score of 1.41 Trend Entropy of successive pairs in symbolized series SB_MotifThree_quantile_hh
Feature : 4 cdrec with a score of 0.86 Correlation Histogram-based automutual information (lag 2, 5 bins) CO_HistogramAMI_even_2_5
Feature : 8 cdrec with a score of 0.48 Geometry Transition matrix column variance SB_TransitionMatrix_3ac_sumdiagcov
Feature : 13 cdrec with a score of 0.46 Geometry Positive outlier timing DN_OutlierInclude_p_001_mdrmd
Feature : 14 cdrec with a score of 0.01 Geometry Negative outlier timing DN_OutlierInclude_n_001_mdrmd
Feature : 3 cdrec with a score of 0.0 Correlation First minimum of the ACF CO_FirstMin_ac
Feature : 1 cdrec with a score of 90.54 Geometry 10-bin histogram mode DN_HistogramMode_10
Feature : 12 cdrec with a score of 3.99 Correlation Change in autocorrelation timescale after incremental differencing FC_LocalSimple_mean1_tauresrat
Feature : 5 cdrec with a score of 3.83 Correlation Time reversibility CO_trev_1_num
Feature : 18 cdrec with a score of 0.57 Geometry Rescaled range fluctuation analysis (low-scale scaling) SC_FluctAnal_2_rsrangefit_50_1_logi_prop_r1
Feature : 13 cdrec with a score of 0.37 Geometry Positive outlier timing DN_OutlierInclude_p_001_mdrmd
Feature : 3 cdrec with a score of 0.33 Correlation First minimum of the ACF CO_FirstMin_ac
Feature : 14 cdrec with a score of 0.29 Geometry Negative outlier timing DN_OutlierInclude_n_001_mdrmd
Feature : 6 cdrec with a score of 0.09 Geometry Proportion of high incremental changes in the series MD_hrv_classic_pnn40
Feature : 0 cdrec with a score of 0.0 Geometry 5-bin histogram mode DN_HistogramMode_5
Feature : 2 cdrec with a score of 0.0 Correlation First 1/e crossing of the ACF CO_f1ecac
Feature : 4 cdrec with a score of 0.0 Correlation Histogram-based automutual information (lag 2, 5 bins) CO_HistogramAMI_even_2_5
Feature : 7 cdrec with a score of 0.0 Geometry Longest stretch of above-mean values SB_BinaryStats_mean_longstretch1
Feature : 8 cdrec with a score of 0.0 Geometry Transition matrix column variance SB_TransitionMatrix_3ac_sumdiagcov
Feature : 9 cdrec with a score of 0.0 Trend Wangs periodicity metric PD_PeriodicityWang_th0_01
Feature : 10 cdrec with a score of 0.0 Geometry Goodness of exponential fit to embedding distance distribution CO_Embed2_Dist_tau_d_expfit_meandiff
Feature : 11 cdrec with a score of 0.0 Correlation First minimum of the AMI function IN_AutoMutualInfoStats_40_gaussian_fmmi
Feature : 15 cdrec with a score of 0.0 Transformation Power in the lowest 20% of frequencies SP_Summaries_welch_rect_area_5_1
Feature : 16 cdrec with a score of 0.0 Geometry Longest stretch of decreasing values SB_BinaryStats_diff_longstretch0
Feature : 18 cdrec with a score of 0.0 Geometry Rescaled range fluctuation analysis (low-scale scaling) SC_FluctAnal_2_rsrangefit_50_1_logi_prop_r1
Feature : 17 cdrec with a score of 0.0 Trend Entropy of successive pairs in symbolized series SB_MotifThree_quantile_hh
Feature : 19 cdrec with a score of 0.0 Geometry Detrended fluctuation analysis (low-scale scaling) SC_FluctAnal_2_dfa_50_1_2_logi_prop_r1
Feature : 20 cdrec with a score of 0.0 Transformation Centroid frequency SP_Summaries_welch_rect_centroid
Feature : 21 cdrec with a score of 0.0 Trend Error of 3-point rolling mean forecast FC_LocalSimple_mean3_stderr
Binary file modified imputegap/assets/shap/chlorine_cdrec_shap_aggregate_plot.png
Binary file modified imputegap/assets/shap/chlorine_cdrec_shap_correlation_plot.png
Binary file modified imputegap/assets/shap/chlorine_cdrec_shap_geometry_plot.png
Binary file modified imputegap/assets/shap/chlorine_cdrec_shap_plot.png
Binary file modified imputegap/assets/shap/chlorine_cdrec_shap_reverse_plot.png
Binary file modified imputegap/assets/shap/chlorine_cdrec_shap_trend_plot.png
22 changes: 22 additions & 0 deletions imputegap/assets/shap/fmri-objectviewing_cdrec_results.txt
@@ -0,0 +1,22 @@
Feature : 1 cdrec with a score of 33.18 Geometry 10-bin histogram mode DN_HistogramMode_10
Feature : 0 cdrec with a score of 31.45 Geometry 5-bin histogram mode DN_HistogramMode_5
Feature : 6 cdrec with a score of 8.37 Geometry Proportion of high incremental changes in the series MD_hrv_classic_pnn40
Feature : 5 cdrec with a score of 7.89 Correlation Time reversibility CO_trev_1_num
Feature : 2 cdrec with a score of 4.04 Correlation First 1/e crossing of the ACF CO_f1ecac
Feature : 21 cdrec with a score of 3.72 Trend Error of 3-point rolling mean forecast FC_LocalSimple_mean3_stderr
Feature : 13 cdrec with a score of 2.65 Geometry Positive outlier timing DN_OutlierInclude_p_001_mdrmd
Feature : 17 cdrec with a score of 2.16 Trend Entropy of successive pairs in symbolized series SB_MotifThree_quantile_hh
Feature : 15 cdrec with a score of 2.02 Transformation Power in the lowest 20% of frequencies SP_Summaries_welch_rect_area_5_1
Feature : 4 cdrec with a score of 1.76 Correlation Histogram-based automutual information (lag 2, 5 bins) CO_HistogramAMI_even_2_5
Feature : 10 cdrec with a score of 1.32 Geometry Goodness of exponential fit to embedding distance distribution CO_Embed2_Dist_tau_d_expfit_meandiff
Feature : 12 cdrec with a score of 0.76 Correlation Change in autocorrelation timescale after incremental differencing FC_LocalSimple_mean1_tauresrat
Feature : 14 cdrec with a score of 0.36 Geometry Negative outlier timing DN_OutlierInclude_n_001_mdrmd
Feature : 8 cdrec with a score of 0.33 Geometry Transition matrix column variance SB_TransitionMatrix_3ac_sumdiagcov
Feature : 3 cdrec with a score of 0.0 Correlation First minimum of the ACF CO_FirstMin_ac
Feature : 7 cdrec with a score of 0.0 Geometry Longest stretch of above-mean values SB_BinaryStats_mean_longstretch1
Feature : 9 cdrec with a score of 0.0 Trend Wangs periodicity metric PD_PeriodicityWang_th0_01
Feature : 11 cdrec with a score of 0.0 Correlation First minimum of the AMI function IN_AutoMutualInfoStats_40_gaussian_fmmi
Feature : 16 cdrec with a score of 0.0 Geometry Longest stretch of decreasing values SB_BinaryStats_diff_longstretch0
Feature : 18 cdrec with a score of 0.0 Geometry Rescaled range fluctuation analysis (low-scale scaling) SC_FluctAnal_2_rsrangefit_50_1_logi_prop_r1
Feature : 19 cdrec with a score of 0.0 Geometry Detrended fluctuation analysis (low-scale scaling) SC_FluctAnal_2_dfa_50_1_2_logi_prop_r1
Feature : 20 cdrec with a score of 0.0 Transformation Centroid frequency SP_Summaries_welch_rect_centroid
22 changes: 22 additions & 0 deletions imputegap/assets/shap/fmri-objectviewing_iim_results.txt
@@ -0,0 +1,22 @@
Feature : 0 iim with a score of 34.56 Geometry 5-bin histogram mode DN_HistogramMode_5
Feature : 1 iim with a score of 25.08 Geometry 10-bin histogram mode DN_HistogramMode_10
Feature : 4 iim with a score of 14.99 Correlation Histogram-based automutual information (lag 2, 5 bins) CO_HistogramAMI_even_2_5
Feature : 15 iim with a score of 7.09 Transformation Power in the lowest 20% of frequencies SP_Summaries_welch_rect_area_5_1
Feature : 21 iim with a score of 6.78 Trend Error of 3-point rolling mean forecast FC_LocalSimple_mean3_stderr
Feature : 5 iim with a score of 6.36 Correlation Time reversibility CO_trev_1_num
Feature : 6 iim with a score of 1.45 Geometry Proportion of high incremental changes in the series MD_hrv_classic_pnn40
Feature : 14 iim with a score of 0.94 Geometry Negative outlier timing DN_OutlierInclude_n_001_mdrmd
Feature : 10 iim with a score of 0.9 Geometry Goodness of exponential fit to embedding distance distribution CO_Embed2_Dist_tau_d_expfit_meandiff
Feature : 17 iim with a score of 0.73 Trend Entropy of successive pairs in symbolized series SB_MotifThree_quantile_hh
Feature : 2 iim with a score of 0.66 Correlation First 1/e crossing of the ACF CO_f1ecac
Feature : 8 iim with a score of 0.23 Geometry Transition matrix column variance SB_TransitionMatrix_3ac_sumdiagcov
Feature : 12 iim with a score of 0.2 Correlation Change in autocorrelation timescale after incremental differencing FC_LocalSimple_mean1_tauresrat
Feature : 13 iim with a score of 0.03 Geometry Positive outlier timing DN_OutlierInclude_p_001_mdrmd
Feature : 3 iim with a score of 0.0 Correlation First minimum of the ACF CO_FirstMin_ac
Feature : 7 iim with a score of 0.0 Geometry Longest stretch of above-mean values SB_BinaryStats_mean_longstretch1
Feature : 9 iim with a score of 0.0 Trend Wangs periodicity metric PD_PeriodicityWang_th0_01
Feature : 11 iim with a score of 0.0 Correlation First minimum of the AMI function IN_AutoMutualInfoStats_40_gaussian_fmmi
Feature : 16 iim with a score of 0.0 Geometry Longest stretch of decreasing values SB_BinaryStats_diff_longstretch0
Feature : 18 iim with a score of 0.0 Geometry Rescaled range fluctuation analysis (low-scale scaling) SC_FluctAnal_2_rsrangefit_50_1_logi_prop_r1
Feature : 19 iim with a score of 0.0 Geometry Detrended fluctuation analysis (low-scale scaling) SC_FluctAnal_2_dfa_50_1_2_logi_prop_r1
Feature : 20 iim with a score of 0.0 Transformation Centroid frequency SP_Summaries_welch_rect_centroid
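The `*_results.txt` files in this commit share one line layout: a feature index, the algorithm name, a score, a category, a free-text description, and a feature identifier. The following is a minimal sketch of how such lines could be parsed and summed per category. It is not code from ImputeGAP: the field layout is inferred only from the lines shown in this diff, and `parse_result_line` and `score_by_category` are hypothetical helper names.

```python
import re
from collections import defaultdict

# Assumed layout, inferred from the diff (not from ImputeGAP's own code):
# "Feature : <index> <algorithm> with a score of <score> <category> <description> <identifier>"
LINE_RE = re.compile(
    r"^Feature\s*:\s*(?P<index>\d+)\s+(?P<algorithm>\S+)\s+with a score of\s+"
    r"(?P<score>\d+(?:\.\d+)?)\s+(?P<category>\S+)\s+"
    r"(?P<description>.+?)\s+(?P<name>\S+)\s*$"
)


def parse_result_line(line):
    """Return a dict of fields for one results line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["index"] = int(record["index"])
    record["score"] = float(record["score"])
    return record


def score_by_category(lines):
    """Sum the scores of all parseable lines, grouped by category."""
    totals = defaultdict(float)
    for line in lines:
        record = parse_result_line(line)
        if record is not None:
            totals[record["category"]] += record["score"]
    return dict(totals)
```

Applied to the lines of a results file, `score_by_category` gives a quick view of how the scores are distributed across the Geometry, Correlation, Trend, and Transformation categories; lines that do not follow the assumed layout are skipped rather than raising an error.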
Binary file modified imputegap/imputegap/assets/rawdata_plot.jpg
Binary file modified imputegap/recovery/__pycache__/benchmarking.cpython-312.pyc
Binary file not shown.
Binary file modified imputegap/recovery/__pycache__/explainer.cpython-312.pyc
Binary file not shown.
Binary file modified imputegap/recovery/__pycache__/manager.cpython-312.pyc
Binary file not shown.