Day 1:
Morning:
Intro to Spark + DataFrames
Lunch: 12 - 1 pm
Afternoon:
Built-in Functions
User Defined Functions (UDFs)
Caching + Partitioning
Day 2:
Morning:
Data Cleansing & EDA
Linear Regression
Lunch: 12 - 1 pm
Afternoon:
Transformer, Estimator, Pipeline API
MLflow Tracking
MLflow Model Registry
Day 3:
Morning:
Decision Trees
Model Tuning, Cross-Validation, and Grid Search
MLlib Deployment Options
Lunch: 12 - 1 pm
Afternoon:
XGBoost & 3rd Party Libraries
Pandas UDFs & Koalas
Capstone Project & Course Recap
RDDs, DataFrames, Datasets
When/where to use Spark and SparkML
Track, version, and deploy models with MLflow
Use Spark to scale the inference or hyperparameter tuning of single-node models
Types of common ML problems and gotchas
Used Spark before?
Machine learning experience?
Language: Python? Scala?
Name + Responsibilities
Interests/Fun fact