[last updated on January 12, 2020; not complete yet]
Data Augmentation:
- Research Guide: Data Augmentation for Deep Learning, [Nearly] Everything you need to know in 2019 [link], keywords: Random Erasing Data Augmentation (2017), AutoAugment: Learning Augmentation Strategies from Data (CVPR 2019), Fast AutoAugment (2019), Learning Data Augmentation Strategies for Object Detection (2019), SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition (Interspeech 2019), EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks (EMNLP-IJCNLP 2019), Unsupervised Data Augmentation for Consistency Training (2019)
- Data augmentation on entire dataset before splitting [link], conclusion: this practice is incorrect, because augmented copies of a training sample can end up in the validation or test set and inflate the scores. Split first, then augment only the training set.
- How does data augmentation reduce overfitting? [link]
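The point about splitting before augmenting can be illustrated with a small sketch. It assumes a toy dataset of (features, label) pairs; `augment` and `split_then_augment` are hypothetical helper names, and the Gaussian jitter is only a stand-in for whatever augmentation is actually used.

```python
import random

def augment(sample, rng, noise=0.05):
    # Hypothetical augmentation: jitter each feature with small Gaussian noise.
    features, label = sample
    return ([x + rng.gauss(0.0, noise) for x in features], label)

def split_then_augment(data, test_fraction=0.2, copies=2, seed=0):
    rng = random.Random(seed)
    indices = list(range(len(data)))
    rng.shuffle(indices)
    n_test = int(len(data) * test_fraction)
    # 1) Split first, so the test set holds only untouched original samples.
    test = [data[i] for i in indices[:n_test]]
    train = [data[i] for i in indices[n_test:]]
    # 2) Augment the training portion only.
    augmented = [augment(s, rng) for s in train for _ in range(copies)]
    return train + augmented, test

# Toy data (an assumption for illustration): 10 samples, 2 features each.
data = [([float(i), float(i) * 2.0], i % 2) for i in range(10)]
train, test = split_then_augment(data)
```

Because the split happens before any augmentation, no test sample has a near-duplicate in the training set.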
Data Augmentation for Regression Tasks:
Online articles that mention data augmentation (DA) for regression tasks:
- Shehroz Khan's answer to What does the term data augmentation mean in the context of machine learning? [link]
- What you need to know about data augmentation for machine learning [link]
- Data augmentation techniques for general datasets? [link] (Teng: To me, it seems they were discussing feature engineering instead of adding more data points.)
- Data Augmentation Techniques for Cat/Binary/Continuous Numerical Dataset [link], keywords: SMOTE
Data Augmentation for Unbalanced Dataset in Classification Tasks:
- Oversampling and undersampling in data analysis [link]
- imbalanced-learn [GitHub] [docs]
- A collection of 85 minority oversampling techniques (SMOTE) for imbalanced learning with multi-class oversampling and model selection features [docs] [GitHub]
- A Deep Dive Into Imbalanced Data: Over-Sampling [link]
- SMOTE for high-dimensional class-imbalanced data, Rok Blagus and Lara Lusa, 2013 [link]
- SMOTE explained for noobs - Synthetic Minority Over-sampling TEchnique line by line [link]
- Detecting representative data and generating synthetic samples to improve learning accuracy with imbalanced data sets [link]
- ADASYN: Adaptive Synthetic Sampling Method for Imbalanced Data [link]
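Several of the resources above revolve around SMOTE. Its core idea, creating a synthetic minority point by interpolating between a minority sample and one of its nearest minority neighbours, can be sketched as follows. This is a simplified illustration, not the imbalanced-learn implementation; `smote_sketch` and its parameters are hypothetical names, and it requires Python 3.8+ for `math.dist`.

```python
import math
import random

def smote_sketch(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic points from a list of minority-class points."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself).
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        # Pick a random position on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic

# Toy minority class (an assumption for illustration).
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote_sketch(minority, n_new=5)
```

Each synthetic point lies on a line segment between two existing minority points, so it stays inside the region the minority class already occupies.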
Data Augmentation for Image:
- Data Augmentation for Deep Learning [link], keywords: image augmentation packages, PyTorch framework
- 1000x Faster Data Augmentation [link], keywords: learn augmentation policies, Population Based Augmentation, Tune Framework
- A survey on Image Data Augmentation for Deep Learning, Connor Shorten and Taghi M. Khoshgoftaar [link]
- Python | Data Augmentation [link]
- How to Configure Image Data Augmentation in Keras [link]
- Data Augmentation | How to use Deep Learning when you have Limited Data -- Part 2 [link], keywords: online augmentation, offline augmentation
- Data augmentation for improving deep learning in image classification problem, Mikolajczyk et al. [link]
- The Effectiveness of Data Augmentation in Image Classification using Deep Learning, Jason Wang and Luis Perez [link]
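The online/offline distinction mentioned in the keywords above can be sketched with a horizontal flip on toy images stored as nested lists. The function names are hypothetical, and real pipelines would normally rely on a framework's own augmentation utilities rather than code like this.

```python
import random

def horizontal_flip(image):
    # Reverse every row of a 2D image (list of rows).
    return [row[::-1] for row in image]

def offline_augment(images):
    # Offline: build and store the enlarged dataset once, before training.
    return images + [horizontal_flip(img) for img in images]

def online_batches(images, batch_size, rng, flip_prob=0.5):
    # Online: transform samples on the fly each time a batch is drawn,
    # so the stored dataset never grows.
    order = list(range(len(images)))
    rng.shuffle(order)
    for start in range(0, len(order), batch_size):
        batch = [images[i] for i in order[start:start + batch_size]]
        yield [horizontal_flip(img) if rng.random() < flip_prob else img
               for img in batch]

# Two toy 2x3 "images" (an assumption for illustration).
images = [[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [0, 1, 2]]]
offline = offline_augment(images)
online = list(online_batches(images, batch_size=1, rng=random.Random(0)))
```

Offline augmentation trades disk space for simplicity, while online augmentation produces different variants in every epoch at the cost of extra work per batch.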
Data Augmentation for Audio:
to be added ...
Data Augmentation for Texts:
- These are the Easiest Data Augmentation Techniques in Natural Language Processing you can think of -- and they work. Simple text editing techniques can make huge performance gains for small datasets. [link]
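Assuming the article above refers to the EDA operations (synonym replacement, random insertion, random swap, random deletion), the two that need no thesaurus can be sketched as below. The function names and the whitespace tokenization are illustrative choices, not the authors' code.

```python
import random

def random_swap(words, n_swaps, rng):
    # Swap the positions of two randomly chosen words, n_swaps times.
    words = list(words)
    if len(words) < 2:
        return words
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return words

def random_deletion(words, p, rng):
    # Drop each word independently with probability p; never return an
    # empty sentence when the input was non-empty.
    if not words:
        return []
    kept = [w for w in words if rng.random() >= p]
    return kept if kept else [rng.choice(words)]

rng = random.Random(0)
sentence = "data augmentation helps small text datasets".split()
swapped = random_swap(sentence, n_swaps=1, rng=rng)
shortened = random_deletion(sentence, p=0.2, rng=rng)
```

Both operations keep the label of the original sentence, which is why they are usually applied with a small number of swaps and a low deletion probability.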
Data Augmentation for Time Series:
- Data Augmentation strategies for Time Series Forecasting [link]