Data Engineering Foundations Specialization
- Offered by Coursera
Data Engineering Foundations Specialization at Coursera Overview
Duration | 5 months |
Start from | Start Now |
Mode of learning | Online |
Schedule type | Self paced |
Difficulty level | Beginner |
Credential | Certificate |
Data Engineering Foundations Specialization at Coursera Highlights
- Earn a certificate of completion from IBM
- Gain expertise in widely used skills such as Information Engineering, Python Programming, SQL, and Data Science
Data Engineering Foundations Specialization at Coursera Course details
- Working knowledge of Data Engineering Ecosystem and Lifecycle. Viewpoints and tips from Data professionals on starting a career in this domain.
- Python programming basics, including data structures, logic, working with files, invoking APIs, using libraries such as Pandas and NumPy, and performing ETL.
- Relational Database fundamentals including Database Design, Creating Schemas, Tables, Constraints, and working with MySQL, PostgreSQL & IBM Db2.
- The SQL query language, including SELECT, INSERT, UPDATE, and DELETE statements, database functions, stored procedures, working with multiple tables, JOINs, and transactions.
- Data engineering is one of the fastest-growing tech occupations, where the demand for skilled data engineers far outweighs the supply. The goal of data engineering is to make quality data available for fact-finding and data-driven decision making. This Specialization from IBM will help anyone interested in pursuing a career in data engineering by teaching fundamental skills to get started in this field. No prior data engineering experience is required to succeed in this Specialization.
- The Specialization consists of 5 self-paced online courses covering skills required for data engineering, including the data engineering ecosystem and lifecycle, Python, SQL, and Relational Databases. You will learn these data engineering prerequisites through engaging videos and hands-on practice using real tools and real-world databases. You'll develop your understanding of data engineering, gain skills that can be applied directly to a data career, and build the foundation of your data engineering career.
- Upon successfully completing these courses, you will have the practical knowledge and experience to delve deeper into data engineering and work on more advanced data engineering projects.
Data Engineering Foundations Specialization at Coursera Curriculum
Course 1 - Introduction to Data Engineering
This course introduces you to the core concepts, processes, and tools you need to know in order to get a foundational knowledge of data engineering. You will gain an understanding of the modern data ecosystem and the role Data Engineers, Data Scientists, and Data Analysts play in this ecosystem.
The Data Engineering Ecosystem includes several different components. It includes disparate data types, formats, and sources of data. Data Pipelines gather data from multiple sources, transform it into analytics-ready data, and make it available to data consumers for analytics and decision-making. Data repositories, such as relational and non-relational databases, data warehouses, data marts, data lakes, and big data stores process and store this data. Data Integration Platforms combine disparate data into a unified view for the data consumers. You will learn about each of these components in this course. You will also learn about Big Data and the use of some of the Big Data processing tools.
A typical Data Engineering lifecycle includes architecting data platforms, designing data stores, and gathering, importing, wrangling, querying, and analyzing data. It also includes performance monitoring and fine-tuning to ensure systems are performing at optimal levels. In this course, you will learn about the data engineering lifecycle. You will also learn about security, governance, and compliance.
Data Engineering is recognized as one of the fastest-growing fields today. The career opportunities available in the field and the different paths you can take to enter this field are discussed in the course.
The course also includes hands-on labs that guide you to create your IBM Cloud Lite account, provision a database instance, load data into the database instance, and perform some basic querying operations that help you understand your dataset.
Course 2 - Python for Data Science, AI & Development
Kickstart your learning of Python for data science, as well as programming in general, with this beginner-friendly introduction to Python. Python is one of the world's most popular programming languages, and there has never been greater demand for professionals with the ability to apply Python fundamentals to drive business solutions across industries.
This course will take you from zero to programming in Python in a matter of hours, with no prior programming experience necessary! You will learn Python fundamentals, including data structures and data analysis, complete hands-on exercises throughout the course modules, and create a final project to demonstrate your new skills.
By the end of this course, you'll feel comfortable creating basic programs, working with data, and solving real-world problems in Python. You'll gain a strong foundation for more advanced learning in the field, and develop skills to help advance your career.
This course can be applied to multiple Specialization or Professional Certificate programs. Completing this course will count towards your learning in any of the following programs:
IBM Applied AI Professional Certificate
Applied Data Science Specialization
IBM Data Science Professional Certificate
Upon completion of any of the above programs, in addition to earning a Specialization completion certificate from Coursera, you'll also receive a digital badge from IBM recognizing your expertise in the field.
Course 3 - Python Project for Data Engineering
This mini-course lets you apply foundational Python skills by implementing different techniques to collect and work with data. Assume the role of a Data Engineer and extract data from multiple file formats, transform it into specific datatypes, and then load it into a single source for analysis. Continue with the course and test your knowledge by implementing web scraping and extracting data with APIs, all with the help of multiple hands-on labs. After completing this course, you will have the confidence to begin collecting large datasets from multiple sources and transforming them into one primary source, or to begin web scraping to gain valuable business insights, all with the use of Python.
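As a rough illustration of the extract, transform, and load steps described above, here is a minimal sketch using only the Python standard library. It is not taken from the course materials: the sample records, the field names (`height_in`, `weight_lb`), and the in-memory strings standing in for real files are all assumptions made for the example.

```python
import csv
import io
import json

# Hypothetical sample inputs standing in for files in two different formats.
CSV_TEXT = "name,height_in,weight_lb\nalex,65.0,120.0\nbo,70.0,150.0\n"
JSON_TEXT = '[{"name": "cy", "height_in": "68.0", "weight_lb": "140.0"}]'


def extract(csv_text, json_text):
    """Read records from both formats into one list of dicts."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    rows.extend(json.loads(json_text))
    return rows


def transform(rows):
    """Cast string fields to floats and convert them to metric units."""
    return [
        {
            "name": row["name"],
            "height_m": round(float(row["height_in"]) * 0.0254, 2),
            "weight_kg": round(float(row["weight_lb"]) * 0.45359237, 2),
        }
        for row in rows
    ]


def load(rows):
    """Write the unified records to a single CSV target (in memory here)."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=["name", "height_m", "weight_kg"])
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


records = transform(extract(CSV_TEXT, JSON_TEXT))
combined_csv = load(records)
print(combined_csv)
```

In a real project the `extract` step would read from files or an API and the `load` step would write to a file or database; keeping the three stages as separate functions makes each one easy to test on its own.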
PRE-REQUISITE: The Python for Data Science, AI & Development course from IBM is a prerequisite for this project course. Before taking this course, please ensure that you have either completed that course or have equivalent proficiency in working with Python and data.
Course 4 - Introduction to Relational Databases (RDBMS)
Are you ready to dive into the world of data engineering? You'll need a solid understanding of how data is stored, processed, and accessed. You'll need to identify the different types of database that are appropriate for the kind of data you are working with and what processing the data requires.
In this course, you will learn the essential concepts behind relational databases and Relational Database Management Systems (RDBMS). You'll study relational data models, discover how they are created and what benefits they bring, and learn how you can apply them to your own data. You'll be introduced to several industry-standard relational databases, including IBM Db2, MySQL, and PostgreSQL.
This course incorporates hands-on, practical exercises to help you demonstrate your learning. You will work with real databases and explore real-world datasets. You will create database instances and populate them with tables.
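To give a feel for what creating and populating related tables looks like, here is a minimal sketch. It assumes Python's built-in `sqlite3` module as a stand-in for the Db2, MySQL, and PostgreSQL systems the course names, and the `author` and `book` tables are hypothetical examples rather than course datasets.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite leaves foreign key enforcement off unless it is enabled per connection.
conn.execute("PRAGMA foreign_keys = ON")

# Two related tables: each book must point at an existing author.
conn.executescript("""
CREATE TABLE author (
    author_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL
);
CREATE TABLE book (
    book_id   INTEGER PRIMARY KEY,
    title     TEXT NOT NULL,
    author_id INTEGER NOT NULL REFERENCES author(author_id)
);
""")

conn.execute("INSERT INTO author (author_id, name) VALUES (1, 'Ada')")
conn.execute("INSERT INTO book (title, author_id) VALUES ('Notes', 1)")

# A book that references a missing author violates the foreign key constraint.
try:
    conn.execute("INSERT INTO book (title, author_id) VALUES ('Orphan', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

book_count = conn.execute("SELECT COUNT(*) FROM book").fetchone()[0]
print(rejected, book_count)
```

The constraints are what make the model relational in practice: the database itself refuses rows that would break the link between the two tables.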
No prior knowledge of databases or programming is required.
Course 5 - Databases and SQL for Data Science with Python
Much of the world's data resides in databases. SQL (or Structured Query Language) is a powerful language which is used for communicating with and extracting data from databases. A working knowledge of databases and SQL is a must if you want to become a data scientist.
The purpose of this course is to introduce relational database concepts and help you learn and apply foundational knowledge of the SQL language. It is also intended to get you started with performing SQL access in a data science environment.
The emphasis in this course is on hands-on and practical learning. As such, you will work with real databases, real data science tools, and real-world datasets. You will create a database instance in the cloud. Through a series of hands-on labs you will practice building and running SQL queries. You will also learn how to access databases from Jupyter notebooks using SQL and Python.
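Running SQL from Python, as described above, might look something like the following sketch, which could be pasted into a notebook cell. It assumes an in-memory SQLite database via the standard `sqlite3` module rather than the cloud database instance used in the labs, and the `department` and `employee` tables and their rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE employee (
    emp_id INTEGER PRIMARY KEY, name TEXT, salary REAL, dept_id INTEGER
);
""")
conn.executemany(
    "INSERT INTO department VALUES (?, ?)", [(1, "Data"), (2, "Ops")]
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "Ana", 100.0, 1), (2, "Ben", 80.0, 1), (3, "Cal", 90.0, 2)],
)
conn.commit()

# Used as a context manager, the connection commits the transaction on
# success and rolls it back if an exception is raised inside the block.
with conn:
    conn.execute(
        "UPDATE employee SET salary = salary + 10 WHERE dept_id = ?", (1,)
    )

# JOIN the two tables and aggregate per department.
rows = conn.execute("""
    SELECT d.name, COUNT(*) AS headcount, AVG(e.salary) AS avg_salary
    FROM employee AS e
    JOIN department AS d ON d.dept_id = e.dept_id
    GROUP BY d.name
    ORDER BY d.name
""").fetchall()
print(rows)
```

The `?` placeholders pass values separately from the SQL text, which is generally safer than building query strings by hand.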
No prior knowledge of databases, SQL, Python, or programming is required.
Anyone can audit this course at no charge. If you choose to take this course and earn the Coursera course certificate, you can also earn an IBM digital badge upon successful completion of the course.
LIMITED TIME OFFER: Subscription is only $39 USD per month for access to graded materials and a certificate.