The main purpose of the course is to give students the ability to plan and implement big data workflows on HDInsight.
- 3 months' access to e-learning materials
- 3 months' access to training labs
Prerequisites
In addition to their professional experience, students who attend this course should have:
- Programming experience using R, and familiarity with common R packages.
- Knowledge of common statistical methods and data analysis best practices.
- Basic knowledge of the Microsoft Windows operating system and its core functionality.
- Working knowledge of relational databases.
Course outcome
After completing this course, students will be able to:
- Deploy HDInsight Clusters.
- Authorize Users to Access Resources.
- Load Data into HDInsight.
- Troubleshoot HDInsight.
- Implement Batch Solutions.
- Design Batch ETL Solutions for Big Data with Spark.
- Analyze Data with Spark SQL.
- Analyze Data with Hive and Phoenix.
- Describe Stream Analytics.
- Implement Spark Streaming Using the DStream API.
- Develop Big Data Real-Time Processing Solutions with Apache Storm.
- Build Solutions that use Kafka and HBase.
Who should attend
The primary audience for this course is data engineers, data architects, data scientists, and data developers who plan to implement big data engineering workflows on HDInsight.
- Module 1: Getting Started with HDInsight
- Lab : Working with HDInsight
- Module 2: Deploying HDInsight Clusters
- Lab : Managing HDInsight clusters with the Azure Portal
- Module 3: Authorizing Users to Access Resources
- Lab : Authorizing Users to Access Resources
- Module 4: Loading data into HDInsight
- Lab : Loading Data into your Azure account
- Module 5: Troubleshooting HDInsight
- Lab : Troubleshooting HDInsight
- Module 6: Implementing Batch Solutions
- Lab : Implement Batch Solutions
- Module 7: Design Batch ETL solutions for big data with Spark
- Lab : Design Batch ETL solutions for big data with Spark
- Module 8: Analyze Data with Spark SQL
- Lab : Performing exploratory data analysis by using iterative and interactive queries
- Module 9: Analyze Data with Hive and Phoenix
- Lab : Analyze data with Hive and Phoenix
- Module 10: Stream Analytics
- Lab : Implement Stream Analytics
- Module 11: Implementing Streaming Solutions with Kafka and HBase
- Lab : Implementing Streaming Solutions with Kafka and HBase
- Module 12: Develop big data real-time processing solutions with Apache Storm
- Lab : Developing big data real-time processing solutions with Apache Storm
- Module 13: Create Spark Streaming Applications
- Lab : Building a Spark Streaming Application