Spark Programming in Scala for Beginners with Apache Spark 3

Highlights

  • On-Demand course
  • 6 hours 47 minutes
  • All levels

Description

This course does not require any prior knowledge of Apache Spark or Hadoop. The author explains Spark architecture and fundamental concepts to help you come up to speed and grasp the content of this course. The course will help you understand Spark programming and apply that knowledge to build data engineering solutions.

Apache Spark is a lightning-fast unified analytics engine for big data and machine learning. Since its release, it has seen rapid adoption by enterprises across a wide range of industries. Internet powerhouses such as Netflix, Yahoo, and eBay have deployed Spark at massive scale, and it has quickly become the largest open-source community in big data. Mastering Apache Spark therefore opens up a wide range of professional opportunities.

The course starts with a brief introduction to what Apache Spark is. You will then install and use Apache Spark, and look at the Spark execution model and architecture in detail. Next, you will learn the Spark programming model and developer experience, followed by the Spark Structured API foundation and Spark data sources and sinks. You will then explore Spark DataFrame and Dataset transformations along with aggregations, and finally look at Spark DataFrame joins in detail.

By the end of this course, you will understand Spark programming and be able to apply that knowledge to build data engineering solutions.

All the resource files are available in the GitHub repository at https://github.com/PacktPublishing/Spark-Programming-in-Scala-for-Beginners-with-Apache-Spark-3

What You Will Learn

  • Learn Apache Spark and Spark architecture
  • Look at data engineering and data processing in Spark
  • Work with data sources and sinks
  • Work with data frames, datasets, and Spark SQL
  • Use IntelliJ IDEA for Spark development and debugging
  • Understand unit testing, managing application logs, and cluster deployment

Audience

This course is designed for software engineers who want to develop data engineering pipelines and applications using Apache Spark. It is also for data architects and data engineers who are responsible for designing and building their organization's data-centric infrastructure. Managers and architects who do not work directly on Spark implementation, but who work with the people implementing Apache Spark at the ground level, will also benefit.

Before proceeding with the course, you will need basic knowledge of the Scala programming language.

Approach

This is a hands-on course that balances theoretical and practical content. It is example-driven and follows a working-session approach: the author codes live and explains the concepts needed along the way. Each section ends with a summary video to review the concepts learned.

Key Features

  • A comprehensive beginner-level course on Spark programming in Scala
  • Deep dive into Spark 3 architecture and data engineering
  • Complete source code and examples, tested by the author on the Apache Spark 3.0.0 open-source distribution

GitHub Repo

https://github.com/PacktPublishing/Spark-Programming-in-Scala-for-Beginners-with-Apache-Spark-3

About the Author

ScholarNest

ScholarNest is a small team of people passionate about helping others learn and grow in their careers by bridging the gap between their existing and required skills. Together, they have more than 40 years of IT experience as developers, architects, consultants, trainers, and mentors. They have worked with international software services organizations on various data-centric and big data projects. The team firmly believes in lifelong continuous learning and skill development. To promote continuous learning, they started publishing free training videos on their YouTube channel. They conceptualized continuous learning as keeping a journal of their learning, under the Learning Journal banner.

Course Outline

1. Apache Spark Introduction
2. Installing and Using Apache Spark
3. Spark Execution Model and Architecture
4. Spark Programming Model and Developer Experience
5. Spark Structured API Foundation
6. Spark Data Sources and Sinks
7. Spark Dataframe and Dataset Transformations
8. Aggregations in Apache Spark
9. Spark Dataframe Joins
10. Keep Learning

About The Provider

Packt
Birmingham
Founded in 2004 in Birmingham, UK, Packt’s mission is to help the world put software to work in new ways, through the delivery of effective learning and i...
