Using deep learning to analyze data
I’ve been hacking away at various deep learning models for the past few months. Here are my notes, starting with the basics.
Deep Learning and Analysis
Deep learning usually means “machine learning but with neural networks.” I’ve been focused on analyzing large amounts of sequential data (e.g., time series data). The word “analysis” does a lot of work in the preceding sentence. When people say they want to “analyze” data, they usually mean at least one of the following:
- Classification. Assign data points to predefined categories, e.g., deciding whether a household is a “family with 2 kids and a dog” versus a “single person.” (If the groups aren’t known ahead of time, that’s clustering.)
- Regression. Predict target variables from feature variables, e.g., “how much energy will my house use if it’s 99 degrees outside with a relative humidity of 72%?”
- Forecasting. Predict future values of a variable from its own past values, e.g., “how much energy will my house use in August based on historical consumption patterns?”
- Reporting. (I’m not sure why you might want a NN to do reporting though; dashboards seem more useful here.)
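To make the regression framing concrete, here’s a toy sketch of the energy example above. The numbers are made up for illustration; a simple least-squares line fit stands in for a full model.

```python
import numpy as np

# Toy illustration of the regression framing above; the numbers are invented.
# Feature: outdoor temperature (F). Target: household energy use (kWh).
temps = np.array([70, 80, 90, 95, 99])
energy = np.array([20, 26, 33, 37, 40])  # roughly linear in temperature

# Fit a 1-D linear model: energy ~ slope * temp + intercept.
slope, intercept = np.polyfit(temps, energy, 1)

# Regression: predict the target for a new feature value.
predicted = slope * 99 + intercept
print(f"predicted energy at 99F: {predicted:.1f} kWh")
```

Forecasting looks superficially similar but the features are the *past values of the target itself*, which changes how you validate (you must split by time, not at random).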
The Analysis Ecosystem
There are lots of libraries to do lots of different things in ML. When we’re looking at the above analysis problems, I’ve found the following.
- Forecasting: Darts has very accessible authors & solid documentation. Pytorch-Forecasting also exists but does not seem to be well-maintained.
- Regression & Classification: Sktime seems to be the most popular. The estimator support in sktime, however, is fairly limited.
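One trick worth knowing, since libraries in this space lean on it: a univariate forecasting problem can be reduced to an ordinary regression problem by sliding a fixed-size window over the series. This is a from-scratch sketch of that idea, not any particular library’s API; the window size is arbitrary.

```python
import numpy as np

# Reduce forecasting to supervised regression: each row of X holds the
# previous `window` values, and y holds the value that followed them.
def make_windows(series: np.ndarray, window: int):
    X = np.array([series[i : i + window] for i in range(len(series) - window)])
    y = series[window:]
    return X, y

series = np.arange(10, dtype=float)  # stand-in for a real time series
X, y = make_windows(series, window=3)
print(X[0], y[0])  # first training row: lags [0, 1, 2] -> target 3.0
```

Once the data is in this shape, any regressor (linear, gradient-boosted trees, a neural net) can be used as a forecaster.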
Python Packaging
Packaging is a hard problem, and Python seems to suffer more than most from the challenges of good packaging. I’ve historically been a virtualenv kind of developer. However, many of the wheels that I use are not easily available, so I’ve ended up using conda instead. YMMV. (Note that I have an M2, which doesn’t help things.)
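For reference, the kind of workflow this amounts to. This is a minimal sketch; the environment name and package choices are illustrative, not a recommendation.

```shell
# Create and activate a fresh conda environment (name is arbitrary).
conda create -n dl-notes python=3.11 -y
conda activate dl-notes
# conda-forge often has arm64 (M1/M2) builds where PyPI wheels lag behind.
conda install -c conda-forge pytorch -y
```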
Reading
I had previously written about LLMs. Here are some more things that I’ve found useful since then. Note that my focus area has moved beyond LLMs, so there isn’t anything LLM-related here.
Development Notes
If you’re trying to build your own NN, I’d start simple and then layer in complexity (duh). I started here with Neural Regression Using PyTorch: Defining a Network, which was very well written and the best of the tutorials that I explored.
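In that spirit of starting simple, here’s a minimal PyTorch regression net. This is my own sketch, not the tutorial’s code: one hidden layer, MSE loss, plain SGD on synthetic data.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic data: a noisy line y = 3x + 0.5 that the net should recover.
X = torch.linspace(-1, 1, 64).unsqueeze(1)
y = 3 * X + 0.5 + 0.05 * torch.randn_like(X)

# Smallest network that still counts as "deep": one hidden layer.
model = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

initial = loss_fn(model(X), y).item()
for _ in range(200):  # full-batch gradient descent, 200 steps
    opt.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    opt.step()
final = loss_fn(model(X), y).item()
print(f"loss: {initial:.3f} -> {final:.3f}")
```

Getting a loop like this working end to end before adding layers, regularization, or real data saves a lot of debugging later.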
These days, besides conda, I rely heavily on Jupyter Notebooks running in VSCode for exploring different models.
More fundamentals
I’ve found that while you can treat ML models as a black box, it’s super-helpful to have some intuition of how they work when you inevitably run into problems.
- Forecasting: Principles and Practice provides an excellent overview of all the common methodologies for forecasting, with an emphasis on classical (non-ML) methods.
- Behavior Analysis with Machine Learning Using R covers various approaches to applying ML to data (not just forecasting).
- The Illustrated Transformer provides a very good overview of transformers.
- The [Makridakis Competitions](https://en.wikipedia.org/wiki/Makridakis_Competitions) are the premier competitions for time series forecasting. In the M4 competition (2018), Uber surprised the world with a hybrid exponential smoothing / RNN model that won by a significant margin.
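Since exponential smoothing keeps coming up as the classical baseline, here’s what it boils down to. This is a minimal from-scratch sketch of *simple* exponential smoothing (no trend or seasonality); `alpha` is the smoothing factor.

```python
import numpy as np

# Simple exponential smoothing: the level is an exponentially weighted
# average of past observations, and the forecast is flat at that level.
def ses_forecast(series, alpha=0.5):
    level = series[0]
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level  # blend new obs into level
    return level  # forecast for the next step

data = np.array([10.0, 12.0, 11.0, 13.0, 12.0])
print(ses_forecast(data, alpha=0.5))  # -> 12.0
```

Despite its simplicity, variants of this (Holt-Winters, ETS) remain hard-to-beat baselines, which is part of why the hybrid ES/RNN result was so striking.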