Machine Learning, 2026.

Last updated: 2026-03-25

Official Master’s Degree in Industrial Engineering + Master’s Degree in Smart Industry.
ICAI, Universidad Pontificia Comillas.

Sessions

2026-03-25 Session

  • The ARIMA session 4.4 has been updated to a more complete version.
  • The assignment for the forecasting session will be available before this weekend. We will notify you when it is. Keep in mind that some of the requirements for the assignment will be covered in the first session after the Easter break.

2026-03-24 Session

  • Today we start the discussion of baseline models (4.3). The ARIMA session 4.4 has also been uploaded, albeit in a provisional version, since we will keep updating it over the next sessions.

2026-03-18 Session

  • We ended the stationarity discussion.

2026-03-17 Session

  • Today we finished the 4_1 Introduction to forecasting and began the discussion of stationarity in 4_2.

2026-03-11 Session

  • Today we begin the forecasting section of the course.

2026-03-10 Session: Midterm (15:00-17:00, room 201)

  • The files for the midterm are expected to be available by doing a regular git pull for the course repository. If that fails, they can be obtained as a password-protected zip file at this link. The password will be provided at the beginning of the midterm session. Uncompress this file inside the repository folder MLMIIN, and you will find a folder called midterm26 with all the files you need to work on the midterm. Delete the midterm26.zip file after uncompressing, to avoid conflicts afterwards.

2026-02-25 Session (3h)

  • Midterm date:
    2026-03-26, 15:00-17:00. Location: room 201.

2026-02-24 Session (3h)

  • Today we finished the regularization part of 3_2 in the first hour. In the remaining two hours we went through all of section 3_3, discussing more advanced scikit-learn pipelines and then going quickly over nonlinear regression models.

  • Kaggle competition public results: original and clone

2026-02-18 Session (3h)

  • Today we finished the discussion of Linear Regression and we started talking about regularization and feature selection methods.
  • Code for sessions 3_1 and 3_2 has been updated.
    Make sure to git pull before resuming work. If you already made a local copy for 3_1, we suggest you make a second one from the updated version, to avoid any issues with the changes.
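One possible way to make such a working copy without overwriting an earlier one is sketched below. This is only an illustration: the helper name `make_working_copy`, the `_mycopy` suffix, and the example file name are hypothetical and not part of the course repository.

```python
import shutil
from pathlib import Path


def make_working_copy(source: Path, suffix: str = "_mycopy") -> Path:
    """Copy `source` next to itself under a new name, never overwriting.

    Hypothetical helper: the first copy is named <stem>_mycopy<ext>,
    later copies get a numeric counter (<stem>_mycopy2<ext>, ...).
    """
    target = source.with_name(f"{source.stem}{suffix}{source.suffix}")
    counter = 1
    while target.exists():
        counter += 1
        target = source.with_name(
            f"{source.stem}{suffix}{counter}{source.suffix}"
        )
    shutil.copy2(source, target)  # copy2 also preserves file timestamps
    return target


# Example call (hypothetical file name), to be run after `git pull`:
# make_working_copy(Path("3_1_example.ipynb"))
```

Because the helper never reuses an existing name, a copy made from the old version of a file stays untouched when a second copy is made from the updated version.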

2026-02-17 Session (3h)

2026-02-11 Session

  • We covered boosting methods and began the discussion of Support Vector Machines.

2026-02-10 Session

  • Today we finished the decision trees session and started talking about ensemble and boosting methods. We have finished the bagging and random forests part.
  • Pandas 3.0 has arrived (Release date: 2026-01-21). Here you will find a Medium post with a summary of some very relevant changes.
    Note: our Docker container uses a previous version of Pandas, so we will neither need nor be able to use the new features in the course sessions. We encourage you to look at the changes anyway, since Pandas is a major component of the Python Data Science stack, and the new version is a major update.
  • We will also introduce the first assignment of the course.

2026-02-04 Session

  • This session dealt with decision trees.

2026-02-03 Session

  • Today we finished the discussion of KNN models and basic validation methods.
  • Updated code for exercises

2026-01-28 Session

  • Today we continued the discussion of KNN models; we are using this model to introduce many foundational ideas of Machine Learning.

2026-01-27 Session

  • Course schedule proposal:

    For the two weeks preceding the midterms we will have three hour sessions:
    • Tuesdays 17th and 24th February, 15:00-18:00
    • Wednesdays 18th and 25th February, 14:00-17:00
  • Today we finished the Logistic Regression and started the (very preliminary) discussion of KNN.

2026-01-21 Session

We will have our first (mock) course assignment. Please make sure that we know your email and associated GitHub user beforehand.

  • Talk about quarto rendering.

We have been working on 2_2_Classification_Logistic_Regression and we have stopped at the paragraph with the question “What Remains to Be Done in this Model?” (before Prediction and Model Performance Measures in Classification).

2026-01-20 Session

We have finished the 2_1_Classif_EDA_Preprocessing notebook and started 2_2_Classification_Logistic_Regression, discussing the signal vs noise idea behind Logistic Regression in the setting of a dataset with a single numeric input.

  • Remember to discuss course schedule!

  • Talk about gitignore.

  • Talk about quarto rendering.

2026-01-14 Session

  • Remember to discuss course schedule!

2026-01-13 Session

  • Welcome to Machine Learning 2026!

Docker run commands

Mac OS

docker run -it --rm -p 8888:8888 -v "$PWD":/wd mlmiin/mlmiin:2026V01

To mount the exclude folder

Add a second mount point like this to the command. This is needed because Docker does not follow symbolic links on Mac OS or Linux. The command is easily adaptable to mount any other folder under Windows as well, just change the path format accordingly.

docker run -it --rm -p 8888:8888 -v "$PWD":/wd  -v "$PWD"/exclude:/wd/exclude mlmiin/mlmiin:2026V01

Windows

docker run -it --rm -p 8888:8888 -v "$($PWD.Path):/wd" mlmiin/mlmiin:2026V01
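The commands above share the same structure on every platform: a port mapping, one bind mount per folder, and the image name. As a rough sketch of that structure, the Python function below assembles the same argument list. The function name `docker_run_args` and its parameters are hypothetical and introduced only for illustration; the flags, port, mount targets and image tag are copied from the commands above.

```python
from pathlib import Path

IMAGE = "mlmiin/mlmiin:2026V01"  # image tag used in the commands above


def docker_run_args(workdir, mount_exclude=False):
    """Build the argument list for the course `docker run` command.

    Hypothetical helper: `workdir` plays the role of $PWD (the local
    MLMIIN repository folder) and is mounted at /wd in the container.
    """
    workdir = Path(workdir)
    args = [
        "docker", "run", "-it", "--rm",
        "-p", "8888:8888",          # publish container port 8888 on the host
        "-v", f"{workdir}:/wd",     # mount the working folder
    ]
    if mount_exclude:
        # Second mount point for the exclude folder, as described above.
        args += ["-v", f"{workdir / 'exclude'}:/wd/exclude"]
    args.append(IMAGE)
    return args


# Print the command for the current folder without running it.
print(" ".join(docker_run_args(Path.cwd(), mount_exclude=True)))
```

The list form can be passed to `subprocess.run` as is, which avoids the shell quoting differences between the Mac OS and Windows versions of the command.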

Session Notes (html files)

Classification Sessions

Regression Sessions

Forecasting Sessions

Unsupervised Learning Sessions

  • 5_1_Unsupervised_PCA
  • 5_2_Unsupervised_Clustering
  • 5_3_Density_Estimation

References

Géron, A. (2022). Hands-on machine learning with scikit-learn, keras, and TensorFlow (3rd ed.). Sebastopol, CA: O’Reilly Media. Retrieved from https://www.oreilly.com/library/view/hands-on-machine-learning/9781098125967/
Glassner, A. (2021). Deep learning: A visual approach. No Starch Press. Retrieved from https://nostarch.com/deep-learning-visual-approach
Hastie, T., Tibshirani, R., & Friedman, J. H. (2009). The elements of statistical learning: Data mining, inference, and prediction (2nd ed., p. xxii+745). New York, NY, USA: Springer Science+Business Media. https://doi.org/10.1007/978-0-387-84858-7
Hyndman, R. J., Athanasopoulos, G., Garza, A., Challu, C., Mergenthaler Canseco, M., & Olivares, K. G. (2025). Forecasting: Principles and practice, the pythonic way. Retrieved from https://otexts.com/fpppy/
James, G., Witten, D., Hastie, T., Tibshirani, R., & Taylor, J. (2023). An introduction to statistical learning with applications in python. Springer International Publishing. https://doi.org/10.1007/978-3-031-38747-0
Kuhn, M., & Johnson, K. (2013). Applied predictive modeling. New York, NY: Springer. Retrieved from https://link.springer.com/book/10.1007/978-1-4614-6849-3
Murphy, K. P. (2022). Probabilistic machine learning: An introduction. MIT Press. Retrieved from https://probml.github.io/pml-book/
Peixeiro, M. (2022). Time series forecasting in python. Manning. Retrieved from https://www.manning.com/books/time-series-forecasting-in-python-book
Raschka, S., Liu, Y. (Hayden), & Mirjalili, V. (2022). Machine learning with PyTorch and scikit-learn: Develop machine learning and deep learning models with python. Packt Publishing. Retrieved from https://www.packtpub.com/product/machine-learning-with-pytorch-and-scikit-learn/9781801819312
Serrano, L. G. (2021). Grokking machine learning (p. 512). Shelter Island, NY, USA: Manning Publications. Retrieved from https://www.manning.com/books/grokking-machine-learning
VanderPlas, J. (2016). Python data science handbook: Essential tools for working with data. O’Reilly. Retrieved from https://jakevdp.github.io/PythonDataScienceHandbook/