This course teaches students to use Microsoft R Server to create and run analyses on large datasets, and shows how to use it in big data environments such as a Hadoop or Spark cluster or a SQL Server database.
- 3 months of access to e-learning materials
- 3 months of access to training labs
Prerequisites
In addition to their professional experience, students who attend this course should have:
- Programming experience using R, and familiarity with common R packages
- Knowledge of common statistical methods and data analysis best practices
- Basic knowledge of the Microsoft Windows operating system and its core functionality
- Working knowledge of relational databases
Course outcome
After completing this course, students will be able to:
- Explain how Microsoft R Server and Microsoft R Client work
- Use R Client with R Server to explore big data held in different data stores
- Visualize data by using graphs and plots
- Transform and clean big data sets
- Implement options for splitting analysis jobs into parallel tasks
- Build and evaluate regression models generated from big data
- Create, score, and deploy partitioning models generated from big data
- Use R in the SQL Server and Hadoop environments
Who should attend
The primary audience for this course is people who wish to analyze large datasets within a big data environment. The secondary audience is developers who need to integrate R analyses into their solutions.
- Module 1: Microsoft R Server and R Client
  - Lab: Exploring Microsoft R Server and Microsoft R Client
- Module 2: Exploring Big Data
  - Lab: Exploring Big Data
- Module 3: Visualizing Big Data
  - Lab: Visualizing data
- Module 4: Processing Big Data
  - Lab: Processing big data
- Module 5: Parallelizing Analysis Operations
  - Lab: Using rxExec and RevoPemaR to parallelize operations
- Module 6: Creating and Evaluating Regression Models
  - Lab: Creating a linear regression model
- Module 7: Creating and Evaluating Partitioning Models
  - Lab: Creating and evaluating partitioning models
- Module 8: Processing Big Data in SQL Server and Hadoop
  - Lab: Processing big data in SQL Server and Hadoop