Big Data Integration and Processing
- Offered by Coursera
Big Data Integration and Processing at Coursera Overview
Duration | 18 hours |
Start from | Start Now |
Total fee | Free |
Mode of learning | Online |
Difficulty level | Beginner |
Official Website | Explore Free Course |
Credential | Certificate |
Big Data Integration and Processing at Coursera Highlights
- 41% started a new career after completing these courses.
- 35% got a tangible career benefit from this course.
- 20% got a pay increase or promotion.
Big Data Integration and Processing at Coursera Course details
- At the end of the course, you will be able to:
- Retrieve data from example database and big data management systems
- Describe the connections between data management operations and the big data processing patterns needed to utilize them in large-scale analytical applications
- Identify when a big data problem needs data integration
- Execute simple big data integration and processing on Hadoop and Spark platforms
- This course is for those new to data science. Completion of Intro to Big Data is recommended. No prior programming experience is needed, although the ability to install applications and utilize a virtual machine is necessary to complete the hands-on assignments. Refer to the specialization technical requirements for complete hardware and software specifications.
- Hardware Requirements:
- (A) Quad Core Processor (VT-x or AMD-V support recommended), 64-bit; (B) 8 GB RAM; (C) 20 GB disk free. How to find your hardware information: (Windows): Open System by clicking the Start button, right-clicking Computer, and then clicking Properties; (Mac): Open Overview by clicking on the Apple menu and clicking 'About This Mac.' Most computers with 8 GB RAM purchased in the last 3 years will meet the minimum requirements. You will need a high-speed internet connection because you will be downloading files up to 4 GB in size.
- Software Requirements:
- This course relies on several open-source software tools, including Apache Hadoop. All required software can be downloaded and installed free of charge (except for data charges from your internet provider). Software requirements include: Windows 7+, Mac OS X 10.10+, Ubuntu 14.04+, or CentOS 6+; VirtualBox 5+.
Big Data Integration and Processing at Coursera Curriculum
Welcome to Big Data Integration and Processing
What is in this Course?
Summary of Big Data Modeling and Management
Why is Big Data Processing Different?
Slides: Summary & Why Is Big Data Processing Different
Downloading and Installing the Cloudera VM Instructions (Windows)
Downloading and Installing the Cloudera VM Instructions (Mac)
Software Installation Frequently Asked Questions (FAQ)
Instructions for Downloading Hands On Datasets
Instructions for Starting Jupyter
What is Data Retrieval? Part 1
What is Data Retrieval? Part 2
Querying Two Relations
Subqueries
Querying Relational Data with Postgres
Slides: What is Data Retrieval?
Querying Relational Data with Postgres
Retrieving Big Data (Part 2)
Querying JSON Data with MongoDB
Aggregation Functions
Querying Aerospike
Querying Documents in MongoDB
Exploring Pandas DataFrames
Slides: Querying Data Part 2
Querying Documents in MongoDB
Exploring Pandas DataFrames
Retrieving Big Data Quiz
Postgres, MongoDB, and Pandas
Big Data Integration
Overview of Information Integration
A Data Integration Scenario
Integration for Multichannel Customer Analytics
Big Data Management and Processing Using Splunk and Datameer
Why Splunk?
Connected Cars with Ford's OpenXC and Splunk
Big Data Management and Processing using Datameer
Installing Splunk Enterprise on Windows
Installing Splunk Enterprise on Linux
Exploring Splunk Queries
Optional: Creating Pivot Reports in Splunk
Slides: Information Integration
Downloading Splunk Enterprise
Exploring Splunk Queries
Optional: Instructions for Splunk Pivot Tutorial
Information Integration - Quiz
Hands-On With Splunk
Processing Big Data
Big Data Processing Pipelines
Some High-Level Processing Operations in Big Data Pipelines
Aggregation Operations in Big Data Pipelines
Typical Analytical Operations in Big Data Pipelines
Overview of Big Data Processing Systems
The Integration and Processing Layer
Introduction to Apache Spark
Getting Started with Spark
WordCount in Spark
Big Data Processing Pipelines Slides
Big Data Workflow Management
Slides for Big Data Processing Tools and Systems
WordCount in Spark
Pipeline and Tools
WordCount in Spark
Big Data Analytics using Spark
Spark Core: Programming In Spark using RDDs in Pipelines
Spark Core: Transformations
Spark Core: Actions
Spark SQL
Spark Streaming
Spark MLLib
Spark GraphX
Exploring SparkSQL and Spark DataFrames
Analyzing Sensor Data with Spark Streaming
Slides for Module 5 Lesson 1
Slides for Module 5 Lesson 2
Exploring SparkSQL and Spark DataFrames
Instructions for Configuring VirtualBox for Spark Streaming
Analyzing Sensor Data with Spark Streaming
More on Spark
SparkSQL and Spark Streaming
Learn By Doing: Putting MongoDB and Spark to Work
Let's Analyze Soccer Tweets!
Expressing Analytical Questions as MongoDB Queries
Exporting Data from MongoDB to a CSV File
Analyzing Tweets About Countries
Check Your Query Results
Check Your Analysis Results