DANL 320 - Big Data Analytics

Credit(s): 3
Lecture: 3
Non-Lecture: 0

This course teaches you how to analyze big data sets and is specifically designed to bring you up to speed on one of the best technologies for this task, Apache Spark! Top technology companies such as Google, Facebook, Netflix, Airbnb, Amazon, and many more use Spark to solve their big data problems! With the Spark 3.0 DataFrame framework, Spark can run some workloads up to 100x faster than Hadoop MapReduce. This course reviews the basics of Python, then moves on to using the Spark DataFrame API with the latest Spark 3.0 syntax! In addition, you will learn how to perform supervised and unsupervised machine learning on massive datasets using the Machine Learning Library (MLlib). You will also gain hands-on experience using PySpark within the Jupyter Notebook environment. This course also covers the latest Spark technologies, such as Spark SQL and Spark Streaming, as well as advanced data analytics modeling methodologies.

Prerequisite(s): DANL 300
Class Restriction: Junior, Senior