My Inspiration

 

 

 

(Jan 19, 2020)

Voltaire, French philosopher:

Uncertainty is an uncomfortable position. But certainty is an absurd one.

Chinese proverb (as quoted in A. Ben-Tal, L. El Ghaoui, and A. Nemirovski, Robust Optimization, Princeton University Press, 2009; I haven't been able to find the original Chinese wording):

To be uncertain is to be uncomfortable, but to be certain is to be ridiculous.

(Jan 18, 2020)

Friedrich Nietzsche:

He who has a why to live can bear almost any how.

 

HOWTO: Work with Small Datasets

(updated on January 12, 2020; not complete yet)

 

In general:

  1. 7 Effective Ways to Deal with a Small Dataset [link]
  2. Dealing with very small datasets [link]
  3. What to do with "small" data? [link]

 

 

For Images:

  1. Breaking the curse of small datasets in Machine Learning: Part 1 [link]
  2. Breaking the curse of small data sets in Machine Learning: Part 2 [link]
  3. You can probably use deep learning even if your data isn't that big [link]
  4. Applying deep learning to real-world problems [link]

HOWTO: Data Augmentation

(updated on January 12, 2020; not complete yet)

 

Data Augmentation:

  1. Research Guide: Data Augmentation for Deep Learning, [Nearly] Everything you need to know in 2019 [link], keywords: Random Erasing Data Augmentation (2017), AutoAugment: Learning Augmentation Strategies from Data (CVPR 2019), Fast AutoAugment (2019), Learning Data Augmentation Strategies for Object Detection (2019), SpecAugment: for Automatic Speech Recognition (Interspeech 2019), EDA: for Boosting Performance on Text Classification Tasks (EMNLP-IJCNLP 2019), Unsupervised Data Augmentation for Consistency Training (2019)
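  2. Data augmentation on entire dataset before splitting [link], conclusion: this practice is incorrect, because augmented copies of one sample can end up in both the training and the test set and leak information.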
  3. How does data augmentation reduce overfitting? [link]
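
A minimal sketch of the "split first, augment afterwards" order discussed in item 2. It is only an illustration under stated assumptions: the helper names (split_then_augment, jitter) and the noise scale are hypothetical and do not come from any of the linked articles.

```python
import random

def split_then_augment(samples, augment, test_fraction=0.2, seed=0):
    """Hold out a test set first, then augment only the training portion."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    test, train = shuffled[:n_test], shuffled[n_test:]
    # Augmented points are derived from training samples only,
    # so nothing derived from a test sample reaches the training set.
    augmented_train = train + [augment(x, rng) for x in train]
    return augmented_train, test

def jitter(x, rng, scale=0.01):
    """Hypothetical augmentation: add small Gaussian noise to a number."""
    return x + rng.gauss(0.0, scale)

originals = [float(i) for i in range(10)]
train, test = split_then_augment(originals, jitter)
```

The test set keeps only untouched original samples, which is what makes the evaluation honest.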

 

Data Augmentation for Regression Tasks:

Online articles that mention data augmentation (DA) for regression tasks:

  1. Shehroz Khan's answer to What does the term data augmentation mean in the context of machine learning? [link]
  2. What you need to know about data augmentation for machine learning [link]
  3. Data augmentation techniques for general datasets? [link] (Teng: To me, it seems they were discussing feature engineering instead of adding more data points.)
  4. Data Augmentation Techniques for Cat/Binary/Continuous Numerical Dataset [link], keywords: SMOTE

 

 

Data Augmentation for Unbalanced Dataset in Classification Tasks:

  1. Oversampling and undersampling in data analysis [link]
  2. imbalanced-learn [GitHub] [docs]
  3. A collection of 85 minority oversampling techniques (SMOTE) for imbalanced learning with multi-class oversampling and model selection features [docs] [GitHub]
  4. A Deep Dive Into Imbalanced Data: Over-Sampling [link]
  5. SMOTE for high-dimensional class-imbalanced data, Rok Blagus and Lara Lusa, 2013 [link]
  6. SMOTE explained for noobs - Synthetic Minority Over-sampling TEchnique line by line [link]
  7. Detecting representative data and generating synthetic samples to improve learning accuracy with imbalanced data sets [link]
  8. ADASYN: Adaptive Synthetic Sampling Method for Imbalanced Data [link]
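
The core idea behind SMOTE, which several of the links above cover, can be sketched in a few lines: pick a minority sample, pick one of its nearest minority neighbours, and create a new point somewhere on the line segment between them. This is a simplified illustration, not the imbalanced-learn implementation; the function name smote_sketch and its parameters are assumptions made for this example.

```python
import math
import random

def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # The k nearest minority neighbours of the chosen sample (excluding itself).
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic

minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_sketch(minority, n_new=5)
```

For real work a maintained package such as imbalanced-learn (item 2) is probably the better choice; the sketch is only meant to show where the synthetic points come from.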

 

Data Augmentation for Image:

  1. Data Augmentation for Deep Learning [link], keywords: image augmentation packages, PyTorch framework
  2. 1000x Faster Data Augmentation [link], keywords: learn augmentation policies, Population Based Augmentation, Tune Framework
  3. A survey on Image Data Augmentation for Deep Learning, Connor Shorten and Taghi M. Khoshgoftaar [link]
  4. Python | Data Augmentation [link]
  5. How to Configure Image Data Augmentation in Keras [link]
  6. Data Augmentation | How to use Deep Learning when you have Limited Data -- Part 2 [link], keywords: online augmentation, offline augmentation
  7. Data augmentation for improving deep learning in image classification problem, Mikolajczyk et al. [link]
  8. The Effectiveness of Data Augmentation in Image Classification using Deep Learning, Jason Wang and Luis Perez [link]
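
The "online augmentation" and "offline augmentation" keywords in item 6 refer, as I understand them, to two ways of applying the same transforms: offline augmentation stores the transformed copies up front, while online augmentation transforms samples on the fly each time they are loaded. A rough sketch with a horizontal flip on images stored as nested lists (all names here are hypothetical and not tied to any particular framework):

```python
import random

def hflip(image):
    """Mirror a 2-D image (a list of rows) left to right."""
    return [row[::-1] for row in image]

def offline_augment(images):
    """Offline: build the flipped copies once, enlarging the stored dataset."""
    return images + [hflip(img) for img in images]

def online_batches(images, epochs, seed=0, p_flip=0.5):
    """Online: dataset size stays fixed; each epoch yields freshly transformed images."""
    rng = random.Random(seed)
    for _ in range(epochs):
        yield [hflip(img) if rng.random() < p_flip else img for img in images]

images = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
stored = offline_augment(images)
batches = list(online_batches(images, epochs=3))
```

Offline augmentation costs storage, while online augmentation costs compute at training time but can show the model a different variant in every epoch.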

 

Data Augmentation for Audio:

to be added ...

 

Data Augmentation for Texts:

  1. These are the Easiest Data Augmentation Techniques in Natural Language Processing you can think of -- and they work. Simple text editing techniques can make huge performance gains for small datasets. [link]
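
Two of the simple text editing operations from the EDA paper (random swap and random deletion) need no thesaurus and are easy to sketch. This is an illustrative version under my own assumptions, not the authors' reference code; the function signatures and parameter values are hypothetical.

```python
import random

def random_swap(words, n_swaps, rng):
    """Swap two randomly chosen word positions, n_swaps times."""
    words = list(words)
    if len(words) < 2:
        return words
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return words

def random_deletion(words, p, rng):
    """Drop each word with probability p, always keeping at least one word."""
    if len(words) <= 1:
        return list(words)
    kept = [w for w in words if rng.random() > p]
    return kept if kept else [rng.choice(words)]

rng = random.Random(0)
words = "data augmentation helps small datasets".split()
swapped = random_swap(words, n_swaps=2, rng=rng)
deleted = random_deletion(words, p=0.3, rng=rng)
```

The other two EDA operations, synonym replacement and random insertion, additionally need a synonym source such as WordNet.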

 

Data Augmentation for Time Series:

  1. Data Augmentation strategies for Time Series Forecasting [link]

 

 

Academic Job Listings for Operations Management and Information Systems

OM/OR

  1. Operations Academia (link)
  2. Decision Sciences Institute Job Postings (link)
  3. POMS job openings (link)
  4. INFORMS Career Center (link) and mailing lists
  5. DMANET (link)
  6. ORNET (link)
  7. HigherEdJobs (link)
  8. AcademicJobsOnline (link)
  9. MathJobs (American Mathematical Society) (link)

IS

  1. Association for Information Systems Career Services (link) and mailing lists (link)

* Courtesy of Miao Bai, David Bergman, and Yuan Jin.

** Please let me know if I missed anything.

Data Visualization

Videos:

  • A Brief History of Data Visualization | Stanford [link]
  • History of Data Visualization and Telling a Story with Data | UC Berkeley Events [link]
  • 34 Data Visualization: A Brief History of Maps, Time Series, and Charts (FR) | Berkeley Initiative for Transparency in the Social Sciences (BITSS) [link]

Online Articles:

  • A brief history of data visualization | Jon Hazell [link]
  • 8 fantastic examples of data storytelling | import.io [link]
  • Randy Olson here to answer all of your questions about data visualization [link]
  • Spurious Correlations [link]

Books:

  • John W. Tukey | Exploratory Data Analysis [amazon]
  • Robin Williams | The Non-Designer's Design Book (4th Edition) [amazon]
  • Gene Zelazny | Say It with Charts: The Executive's Guide to Visual Communication [amazon]