UCLA Stats 404 - Statistical Computing and Programming

Course Overview (Python), Syllabus, and GitHub repository

  • Week 1: Business use case, setting up a reproducible machine learning environment, introduction to Git
    • Installation instructions for Python and Git ahead of first lecture
    • HW 1 (due January 15)
  • Weeks 2 and 3: Introduction to Python, pandas and SQL
    • Python: expressions, control flow, functions, variable types, passing by reference, list comprehension, functional programming
    • pandas: reading in data, subsetting, EDA, split + apply + combine, pandas + databases
    • Extra Credit 1 (due January 22)
    • HW 2 (due February 5)
    • Lab 1 (due February 5)
    • Extra Credit 2 (due February 5)
  • Weeks 4 and 5: Regression methods + numerical optimization + loss functions
    • linear and logistic regression, elastic net, PCA regression, hyperparameter tuning, deep learning and custom loss functions
    • Lab 2 (in-class, due February 5)
    • HW 3 (due February 12)
  • Week 6: Python and Big Data
    • pandas and big data, Dask, PySpark + Spark SQL, embarrassingly parallel processes, AWS S3
    • HW 4 (due February 19)
    • Extra Credit (due February 19)
  • Weeks 7 and 8: Introduction to Software Development
    • reproducibility, readability, robustness
    • testing suite, ML test, typing, model roll-out
  • Weeks 9 and 10: Final Project Presentations
  • Week 11: Finals Week