24 Apache Spark courses delivered On Demand

AWS CloudFormation Master Class

By Packt

With this course, you will master all CloudFormation concepts, and become confident in writing CloudFormation templates using YAML. Throughout the course, you will encounter various interesting examples and activities that will help you to consolidate your learning.

AWS CloudFormation Master Class
Delivered Online On Demand
£29.99

The Ultimate Hands-On Hadoop

By Packt

This course will show you why Hadoop is one of the best tools to work with big data. With the help of some real-world data sets, you will learn how to use Hadoop and its distributed technologies, such as Spark, Flink, Pig, and Flume, to store, analyze, and scale big data.

The Ultimate Hands-On Hadoop
Delivered Online On Demand
£134.99

SQL NoSQL Big Data and Hadoop

4.7(160)

By Janets

Register for the SQL NoSQL Big Data and Hadoop course today and build the experience, skills and knowledge you need to enhance your professional development and work towards your dream job. Study this course through online learning and take the first steps towards a long-term career.

The course consists of a number of easy-to-digest, in-depth modules, designed to provide you with a detailed, expert level of knowledge. Learn through a mixture of instructional video lessons and online study materials. Receive online tutor support as you study the course, to ensure you are supported every step of the way. Get a digital certificate as proof of your course completion.

The SQL NoSQL Big Data and Hadoop course is great value and allows you to study at your own pace. Access the course modules from any internet-enabled device, including computers, tablets, and smartphones. The course is designed to increase your employability and equip you with everything you need to be a success. Enrol now and start learning instantly!

WHAT YOU GET WITH THE SQL NOSQL BIG DATA AND HADOOP
* Receive an e-certificate upon successful completion of the course
* Get taught by experienced, professional instructors
* Study at a time and pace that suits your learning style
* Get instant feedback on assessments
* 24/7 help and advice via email or live chat
* Get full tutor support on weekdays (Monday to Friday)

COURSE DESIGN
The course is delivered through our online learning platform, accessible through any internet-connected device. There are no formal deadlines or teaching schedules, meaning you are free to study the course at your own pace. You are taught through a combination of:
* Video lessons
* Online study materials

CERTIFICATION
Upon successful completion of the course, you will be able to obtain your course completion e-certificate free of cost. A printed copy by post is also available at an additional cost of £9.99, and a PDF certificate at £4.99.

WHO IS THIS COURSE FOR
The course is ideal for those who already work in this sector or are aspiring professionals. This course is designed to enhance your expertise and boost your CV. Learn key skills and gain a professional qualification to prove your newly acquired knowledge.

REQUIREMENTS
The online training is open to all students and has no formal entry requirements. To study the SQL NoSQL Big Data and Hadoop course, all you need is a passion for learning, a good understanding of English, numeracy, and IT skills. You must also be over the age of 16.

COURSE CONTENT

Section 01: Introduction
* Introduction 00:07:00
* Building a Data-driven Organization - Introduction 00:04:00
* Data Engineering 00:06:00
* Learning Environment & Course Material 00:04:00
* Movielens Dataset 00:03:00

Section 02: Relational Database Systems
* Introduction to Relational Databases 00:09:00
* SQL 00:05:00
* Movielens Relational Model 00:15:00
* Movielens Relational Model: Normalization vs Denormalization 00:16:00
* MySQL 00:05:00
* Movielens in MySQL: Database import 00:06:00
* OLTP in RDBMS: CRUD Applications 00:17:00
* Indexes 00:16:00
* Data Warehousing 00:15:00
* Analytical Processing 00:17:00
* Transaction Logs 00:06:00
* Relational Databases - Wrap Up 00:03:00

Section 03: Database Classification
* Distributed Databases 00:07:00
* CAP Theorem 00:10:00
* BASE 00:07:00
* Other Classifications 00:07:00

Section 04: Key-Value Store
* Introduction to KV Stores 00:02:00
* Redis 00:04:00
* Install Redis 00:07:00
* Time Complexity of Algorithm 00:05:00
* Data Structures in Redis: Key & String 00:20:00
* Data Structures in Redis II: Hash & List 00:18:00
* Data Structures in Redis III: Set & Sorted Set 00:21:00
* Data Structures in Redis IV: Geo & HyperLogLog 00:11:00
* Data Structures in Redis V: Pubsub & Transaction 00:08:00
* Modelling Movielens in Redis 00:11:00
* Redis Example in Application 00:29:00
* KV Stores: Wrap Up 00:02:00

Section 05: Document-Oriented Databases
* Introduction to Document-Oriented Databases 00:05:00
* MongoDB 00:04:00
* MongoDB Installation 00:02:00
* Movielens in MongoDB 00:13:00
* Movielens in MongoDB: Normalization vs Denormalization 00:11:00
* Movielens in MongoDB: Implementation 00:10:00
* CRUD Operations in MongoDB 00:13:00
* Indexes 00:16:00
* MongoDB Aggregation Query - MapReduce function 00:09:00
* MongoDB Aggregation Query - Aggregation Framework 00:16:00
* Demo: MySQL vs MongoDB. Modeling with Spark 00:02:00
* Document Stores: Wrap Up 00:03:00

Section 06: Search Engines
* Introduction to Search Engine Stores 00:05:00
* Elasticsearch 00:09:00
* Basic Terms Concepts and Description 00:13:00
* Movielens in Elasticsearch 00:12:00
* CRUD in Elasticsearch 00:15:00
* Search Queries in Elasticsearch 00:23:00
* Aggregation Queries in Elasticsearch 00:23:00
* The Elastic Stack (ELK) 00:12:00
* Use case: UFO Sighting in Elasticsearch 00:29:00
* Search Engines: Wrap Up 00:04:00

Section 07: Wide Column Store
* Introduction to Columnar databases 00:06:00
* HBase 00:07:00
* HBase Architecture 00:09:00
* HBase Installation 00:09:00
* Apache Zookeeper 00:06:00
* Movielens Data in HBase 00:17:00
* Performing CRUD in HBase 00:24:00
* SQL on HBase - Apache Phoenix 00:14:00
* SQL on HBase - Apache Phoenix - Movielens 00:10:00
* Demo: GeoLife GPS Trajectories 00:02:00
* Wide Column Store: Wrap Up 00:04:00

Section 08: Time Series Databases
* Introduction to Time Series 00:09:00
* InfluxDB 00:03:00
* InfluxDB Installation 00:07:00
* InfluxDB Data Model 00:07:00
* Data manipulation in InfluxDB 00:17:00
* TICK Stack I 00:12:00
* TICK Stack II 00:23:00
* Time Series Databases: Wrap Up 00:04:00

Section 09: Graph Databases
* Introduction to Graph Databases 00:05:00
* Modelling in Graph 00:14:00
* Modelling Movielens as a Graph 00:10:00
* Neo4J 00:04:00
* Neo4J installation 00:08:00
* Cypher 00:12:00
* Cypher II 00:19:00
* Movielens in Neo4J: Data Import 00:17:00
* Movielens in Neo4J: Spring Application 00:12:00
* Data Analysis in Graph Databases 00:05:00
* Examples of Graph Algorithms in Neo4J 00:18:00
* Graph Databases: Wrap Up 00:07:00

Section 10: Hadoop Platform
* Introduction to Big Data With Apache Hadoop 00:06:00
* Big Data Storage in Hadoop (HDFS) 00:16:00
* Big Data Processing: YARN 00:11:00
* Installation 00:13:00
* Data Processing in Hadoop (MapReduce) 00:14:00
* Examples in MapReduce 00:25:00
* Data Processing in Hadoop (Pig) 00:12:00
* Examples in Pig 00:21:00
* Data Processing in Hadoop (Spark) 00:23:00
* Examples in Spark 00:23:00
* Data Analytics with Apache Spark 00:09:00
* Data Compression 00:06:00
* Data serialization and storage formats 00:20:00
* Hadoop: Wrap Up 00:07:00

Section 11: Big Data SQL Engines
* Introduction to Big Data SQL Engines 00:03:00
* Apache Hive 00:10:00
* Apache Hive: Demonstration 00:20:00
* MPP SQL-on-Hadoop: Introduction 00:03:00
* Impala 00:06:00
* Impala: Demonstration 00:18:00
* PrestoDB 00:13:00
* PrestoDB: Demonstration 00:14:00
* SQL-on-Hadoop: Wrap Up 00:02:00

Section 12: Distributed Commit Log
* Data Architectures 00:05:00
* Introduction to Distributed Commit Logs 00:07:00
* Apache Kafka 00:03:00
* Confluent Platform Installation 00:10:00
* Data Modeling in Kafka I 00:13:00
* Data Modeling in Kafka II 00:15:00
* Data Generation for Testing 00:09:00
* Use case: Toll fee Collection 00:04:00
* Stream processing 00:11:00
* Stream Processing II with Stream + Connect APIs 00:19:00
* Example: Kafka Streams 00:15:00
* KSQL: Streaming Processing in SQL 00:04:00
* KSQL: Example 00:14:00
* Demonstration: NYC Taxi and Fares 00:01:00
* Streaming: Wrap Up 00:02:00

Section 13: Summary
* Database Polyglot 00:04:00
* Extending your knowledge 00:08:00
* Data Visualization 00:11:00
* Building a Data-driven Organization - Conclusion 00:07:00
* Conclusion 00:03:00

Resources
* Resources - SQL NoSQL Big Data And Hadoop 00:00:00

SQL NoSQL Big Data and Hadoop
Delivered Online On Demand
£25

Apache Kafka Series - Learn Apache Kafka for Beginners v3

By Packt

A beginner-level course that follows a step-by-step approach to learning the fundamentals and core concepts of Apache Kafka 3.0. You will work through interesting activities such as programming a Twitter producer and Elasticsearch consumer to understand the various concepts.

Apache Kafka Series - Learn Apache Kafka for Beginners v3
Delivered Online On Demand
£35.99

Online Options

Spark Programming in Scala for Beginners with Apache Spark 3

By Packt

This course does not require any prior knowledge of Apache Spark or Hadoop. The author explains Spark architecture and fundamental concepts to help you come up to speed and grasp the content of this course. The course will help you understand Spark programming and apply that knowledge to build data engineering solutions.

Spark Programming in Scala for Beginners with Apache Spark 3
Delivered Online On Demand
£14.99

Apache Spark 3 Advance Skills for Cracking Job Interviews

By Packt

A carefully structured advanced-level course on Apache Spark 3 to help you clear your job interviews. This course covers advanced topics and concepts that are part of the Databricks Spark certification exam. Boost your skills in Spark 3 architecture and memory management.

Apache Spark 3 Advance Skills for Cracking Job Interviews
Delivered Online On Demand
£67.99

Real-Time Stream Processing Using Apache Spark 3 for Scala Developers

By Packt

Learn the process to design and develop big data engineering projects using Apache Spark. This example-driven advanced-level course will help you understand real-time stream processing using Apache Spark and you can apply that knowledge to build real-time stream processing solutions.

Real-Time Stream Processing Using Apache Spark 3 for Scala Developers
Delivered Online On Demand
£22.99

DP-203T00 Data Engineering on Microsoft Azure

By Nexus Human

Duration
4 Days
24 CPD hours

This course is intended for
The primary audience for this course is data professionals, data architects, and business intelligence professionals who want to learn about data engineering and building analytical solutions using data platform technologies that exist on Microsoft Azure. The secondary audience for this course includes data analysts and data scientists who work with analytical solutions built on Microsoft Azure.

In this course, the student will learn how to implement and manage data engineering workloads on Microsoft Azure, using Azure services such as Azure Synapse Analytics, Azure Data Lake Storage Gen2, Azure Stream Analytics, Azure Databricks, and others. The course focuses on common data engineering tasks such as orchestrating data transfer and transformation pipelines, working with data files in a data lake, creating and loading relational data warehouses, capturing and aggregating streams of real-time data, and tracking data assets and lineage.

Prerequisites
Successful students start this course with knowledge of cloud computing and core data concepts and professional experience with data solutions.
* AZ-900T00 Microsoft Azure Fundamentals
* DP-900T00 Microsoft Azure Data Fundamentals

1 - INTRODUCTION TO DATA ENGINEERING ON AZURE
* What is data engineering
* Important data engineering concepts
* Data engineering in Microsoft Azure

2 - INTRODUCTION TO AZURE DATA LAKE STORAGE GEN2
* Understand Azure Data Lake Storage Gen2
* Enable Azure Data Lake Storage Gen2 in Azure Storage
* Compare Azure Data Lake Store to Azure Blob storage
* Understand the stages for processing big data
* Use Azure Data Lake Storage Gen2 in data analytics workloads

3 - INTRODUCTION TO AZURE SYNAPSE ANALYTICS
* What is Azure Synapse Analytics
* How Azure Synapse Analytics works
* When to use Azure Synapse Analytics

4 - USE AZURE SYNAPSE SERVERLESS SQL POOL TO QUERY FILES IN A DATA LAKE
* Understand Azure Synapse serverless SQL pool capabilities and use cases
* Query files using a serverless SQL pool
* Create external database objects

5 - USE AZURE SYNAPSE SERVERLESS SQL POOLS TO TRANSFORM DATA IN A DATA LAKE
* Transform data files with the CREATE EXTERNAL TABLE AS SELECT statement
* Encapsulate data transformations in a stored procedure
* Include a data transformation stored procedure in a pipeline

6 - CREATE A LAKE DATABASE IN AZURE SYNAPSE ANALYTICS
* Understand lake database concepts
* Explore database templates
* Create a lake database
* Use a lake database

7 - ANALYZE DATA WITH APACHE SPARK IN AZURE SYNAPSE ANALYTICS
* Get to know Apache Spark
* Use Spark in Azure Synapse Analytics
* Analyze data with Spark
* Visualize data with Spark

8 - TRANSFORM DATA WITH SPARK IN AZURE SYNAPSE ANALYTICS
* Modify and save dataframes
* Partition data files
* Transform data with SQL

9 - USE DELTA LAKE IN AZURE SYNAPSE ANALYTICS
* Understand Delta Lake
* Create Delta Lake tables
* Create catalog tables
* Use Delta Lake with streaming data
* Use Delta Lake in a SQL pool

10 - ANALYZE DATA IN A RELATIONAL DATA WAREHOUSE
* Design a data warehouse schema
* Create data warehouse tables
* Load data warehouse tables
* Query a data warehouse

11 - LOAD DATA INTO A RELATIONAL DATA WAREHOUSE
* Load staging tables
* Load dimension tables
* Load time dimension tables
* Load slowly changing dimensions
* Load fact tables
* Perform post load optimization

12 - BUILD A DATA PIPELINE IN AZURE SYNAPSE ANALYTICS
* Understand pipelines in Azure Synapse Analytics
* Create a pipeline in Azure Synapse Studio
* Define data flows
* Run a pipeline

13 - USE SPARK NOTEBOOKS IN AN AZURE SYNAPSE PIPELINE
* Understand Synapse Notebooks and Pipelines
* Use a Synapse notebook activity in a pipeline
* Use parameters in a notebook

14 - PLAN HYBRID TRANSACTIONAL AND ANALYTICAL PROCESSING USING AZURE SYNAPSE ANALYTICS
* Understand hybrid transactional and analytical processing patterns
* Describe Azure Synapse Link

15 - IMPLEMENT AZURE SYNAPSE LINK WITH AZURE COSMOS DB
* Enable Cosmos DB account to use Azure Synapse Link
* Create an analytical store enabled container
* Create a linked service for Cosmos DB
* Query Cosmos DB data with Spark
* Query Cosmos DB with Synapse SQL

16 - IMPLEMENT AZURE SYNAPSE LINK FOR SQL
* What is Azure Synapse Link for SQL?
* Configure Azure Synapse Link for Azure SQL Database
* Configure Azure Synapse Link for SQL Server 2022

17 - GET STARTED WITH AZURE STREAM ANALYTICS
* Understand data streams
* Understand event processing
* Understand window functions

18 - INGEST STREAMING DATA USING AZURE STREAM ANALYTICS AND AZURE SYNAPSE ANALYTICS
* Stream ingestion scenarios
* Configure inputs and outputs
* Define a query to select, filter, and aggregate data
* Run a job to ingest data

19 - VISUALIZE REAL-TIME DATA WITH AZURE STREAM ANALYTICS AND POWER BI
* Use a Power BI output in Azure Stream Analytics
* Create a query for real-time visualization
* Create real-time data visualizations in Power BI

20 - INTRODUCTION TO MICROSOFT PURVIEW
* What is Microsoft Purview?
* How Microsoft Purview works
* When to use Microsoft Purview

21 - INTEGRATE MICROSOFT PURVIEW AND AZURE SYNAPSE ANALYTICS
* Catalog Azure Synapse Analytics data assets in Microsoft Purview
* Connect Microsoft Purview to an Azure Synapse Analytics workspace
* Search a Purview catalog in Synapse Studio
* Track data lineage in pipelines

22 - EXPLORE AZURE DATABRICKS
* Get started with Azure Databricks
* Identify Azure Databricks workloads
* Understand key concepts

23 - USE APACHE SPARK IN AZURE DATABRICKS
* Get to know Spark
* Create a Spark cluster
* Use Spark in notebooks
* Use Spark to work with data files
* Visualize data

24 - RUN AZURE DATABRICKS NOTEBOOKS WITH AZURE DATA FACTORY
* Understand Azure Databricks notebooks and pipelines
* Create a linked service for Azure Databricks
* Use a Notebook activity in a pipeline
* Use parameters in a notebook

ADDITIONAL COURSE DETAILS
The Nexus Human DP-203T00 Data Engineering on Microsoft Azure training program is a workshop that combines sessions, lessons, and masterclasses. This bootcamp-style experience includes interactive lectures, hands-on labs, and collaborative hackathons, all designed to reinforce fundamental concepts. Guided by seasoned coaches, each session offers insights and practical skills for honing your expertise. Whether you are new to professional skills or a seasoned professional, this course aims to equip you with the knowledge you need.

While we feel this is the best course for DP-203T00 Data Engineering on Microsoft Azure and one of our Top 10, we encourage you to read the course outline to make sure it is the right content for you.

Additionally, private sessions, closed classes or dedicated events are available both live online and at our training centres in Dublin and London, as well as at your offices anywhere in the UK, Ireland or across EMEA.

DP-203T00 Data Engineering on Microsoft Azure
Delivered Online, 5 days, Jun 24th, 13:00 + 4 more
£2380

Real-Time Stream Processing Using Apache Spark 3 for Python Developers

By Packt

Get to grips with real-time stream processing using PySpark as well as Spark structured streaming and apply that knowledge to build stream processing solutions. This course is example-driven and follows a working session-like approach.

Real-Time Stream Processing Using Apache Spark 3 for Python Developers
Delivered Online On Demand
£93.99

Apache Spark with Scala - Hands-On with Big Data!

By Packt

This is a comprehensive and practical Apache Spark course. In this course, you will learn to frame data analysis problems as Spark problems through 20+ hands-on examples, and then scale them up to run on cloud computing services. The course covers Spark 3, IntelliJ, and Structured Streaming, with a stronger focus on the Dataset API.

Apache Spark with Scala - Hands-On with Big Data!
Delivered Online On Demand
£74.99

DP-900T00 Microsoft Azure Data Fundamentals

By Nexus Human

Duration
1 Day
6 CPD hours

This course is intended for
The audience for this course is individuals who want to learn the fundamentals of database concepts in a cloud environment, gain basic skills in cloud data services, and build their foundational knowledge of cloud data services within Microsoft Azure.

Overview
* Describe core data concepts
* Identify considerations for relational data on Azure
* Describe considerations for working with non-relational data on Azure
* Describe an analytics workload on Azure

In this course, students will gain foundational knowledge of core data concepts and related Microsoft Azure data services. Students will learn about core data concepts such as relational, non-relational, big data, and analytics, and build their foundational knowledge of cloud data services within Microsoft Azure. Students will explore fundamental relational data concepts and relational database services in Azure. They will explore Azure storage for non-relational data and the fundamentals of Azure Cosmos DB. Students will learn about large-scale data warehousing, real-time analytics, and data visualization.

1 - EXPLORE CORE DATA CONCEPTS
* Identify data formats
* Explore file storage
* Explore databases
* Explore transactional data processing
* Explore analytical data processing

2 - EXPLORE DATA ROLES AND SERVICES
* Explore job roles in the world of data
* Identify data services

3 - EXPLORE FUNDAMENTAL RELATIONAL DATA CONCEPTS
* Understand relational data
* Understand normalization
* Explore SQL
* Describe database objects

4 - EXPLORE RELATIONAL DATABASE SERVICES IN AZURE
* Describe Azure SQL services and capabilities
* Describe Azure services for open-source databases

5 - EXPLORE AZURE STORAGE FOR NON-RELATIONAL DATA
* Explore Azure blob storage
* Explore Azure Data Lake Storage Gen2
* Explore Azure Files
* Explore Azure Tables

6 - EXPLORE FUNDAMENTALS OF AZURE COSMOS DB
* Describe Azure Cosmos DB
* Identify Azure Cosmos DB APIs

7 - EXPLORE FUNDAMENTALS OF LARGE-SCALE DATA WAREHOUSING
* Describe data warehousing architecture
* Explore data ingestion pipelines
* Explore analytical data stores

8 - EXPLORE FUNDAMENTALS OF REAL-TIME ANALYTICS
* Understand batch and stream processing
* Explore common elements of stream processing architecture
* Explore Azure Stream Analytics
* Explore Apache Spark on Microsoft Azure

9 - EXPLORE FUNDAMENTALS OF DATA VISUALIZATION
* Describe Power BI tools and workflow
* Describe core concepts of data modeling
* Describe considerations for data visualization

DP-900T00 Microsoft Azure Data Fundamentals
Delivered Online, Two days, Jun 24th, 13:00 + 3 more
£595

Spark Programming in Python for Beginners with Apache Spark 3

By Packt

Advance your data skills by mastering Spark programming in Python. This beginner's level course will help you understand the core concepts related to Apache Spark 3 and provide you with knowledge of applying those concepts to build data engineering solutions.

Spark Programming in Python for Beginners with Apache Spark 3
Delivered Online On Demand
£37.99

Mastering Scala with Apache Spark for the Modern Data Enterprise (TTSK7520)

By Nexus Human

Duration
5 Days
30 CPD hours

This course is intended for
This intermediate and beyond level course is geared for experienced technical professionals in various roles, such as developers, data analysts, data engineers, software engineers, and machine learning engineers who want to leverage Scala and Spark to tackle complex data challenges and develop scalable, high-performance applications across diverse domains. Practical programming experience is required to participate in the hands-on labs.

Overview
Working in a hands-on learning environment led by our expert instructor you'll:
* Develop a basic understanding of Scala and Apache Spark fundamentals, enabling you to confidently create scalable and high-performance applications.
* Learn how to process large datasets efficiently, helping you handle complex data challenges and make data-driven decisions.
* Gain hands-on experience with real-time data streaming, allowing you to manage and analyze data as it flows into your applications.
* Acquire practical knowledge of machine learning algorithms using Spark MLlib, empowering you to create intelligent applications and uncover hidden insights.
* Master graph processing with GraphX, enabling you to analyze and visualize complex relationships in your data.
* Discover generative AI technologies using GPT with Spark and Scala, opening up new possibilities for automating content generation and enhancing data analysis.

Embark on a journey to master the world of big data with our immersive course on Scala and Spark! Mastering Scala with Apache Spark for the Modern Data Enterprise is a five-day hands-on course designed to provide you with the essential skills and tools to tackle complex data projects using the Scala programming language and Apache Spark, a high-performance data processing engine. Mastering these technologies will enable you to perform a wide range of tasks, from data wrangling and analytics to machine learning and artificial intelligence, across various industries and applications.

Guided by our expert instructor, you'll explore the fundamentals of Scala programming and Apache Spark while gaining valuable hands-on experience with Spark programming, RDDs, DataFrames, Spark SQL, and data sources. You'll also explore Spark Streaming, performance optimization techniques, and the integration of popular external libraries, tools, and cloud platforms like AWS, Azure, and GCP. Machine learning enthusiasts will delve into Spark MLlib, covering basics of machine learning algorithms, data preparation, feature extraction, and various techniques such as regression, classification, clustering, and recommendation systems.

INTRODUCTION TO SCALA
* Brief history and motivation
* Differences between Scala and Java
* Basic Scala syntax and constructs
* Scala's functional programming features

INTRODUCTION TO APACHE SPARK
* Overview and history
* Spark components and architecture
* Spark ecosystem
* Comparing Spark with other big data frameworks

BASICS OF SPARK PROGRAMMING
* SparkContext and SparkSession
* Resilient Distributed Datasets (RDDs)
* Transformations and Actions
* Working with DataFrames

SPARK SQL AND DATA SOURCES
* Spark SQL library and its advantages
* Structured and semi-structured data sources
* Reading and writing data in various formats (CSV, JSON, Parquet, Avro, etc.)
* Data manipulation using SQL queries

BASIC RDD OPERATIONS
* Creating and manipulating RDDs
* Common transformations and actions on RDDs
* Working with key-value data

BASIC DATAFRAME AND DATASET OPERATIONS
* Creating and manipulating DataFrames and Datasets
* Column operations and functions
* Filtering, sorting, and aggregating data

INTRODUCTION TO SPARK STREAMING
* Overview of Spark Streaming
* Discretized Stream (DStream) operations
* Windowed operations and stateful processing

PERFORMANCE OPTIMIZATION BASICS
* Best practices for efficient Spark code
* Broadcast variables and accumulators
* Monitoring Spark applications

INTEGRATING EXTERNAL LIBRARIES AND TOOLS, SPARK STREAMING
* Using popular external libraries, such as Hadoop and HBase
* Integrating with cloud platforms: AWS, Azure, GCP
* Connecting to data storage systems: HDFS, S3, Cassandra, etc.

INTRODUCTION TO MACHINE LEARNING BASICS
* Overview of machine learning
* Supervised and unsupervised learning
* Common algorithms and use cases

INTRODUCTION TO SPARK MLLIB
* Overview of Spark MLlib
* MLlib's algorithms and utilities
* Data preparation and feature extraction

LINEAR REGRESSION AND CLASSIFICATION
* Linear regression algorithm
* Logistic regression for classification
* Model evaluation and performance metrics

CLUSTERING ALGORITHMS
* Overview of clustering algorithms
* K-means clustering
* Model evaluation and performance metrics

COLLABORATIVE FILTERING AND RECOMMENDATION SYSTEMS
* Overview of recommendation systems
* Collaborative filtering techniques
* Implementing recommendations with Spark MLlib

INTRODUCTION TO GRAPH PROCESSING
* Overview of graph processing
* Use cases and applications of graph processing
* Graph representations and operations
* Introduction to Spark GraphX
* Overview of GraphX
* Creating and transforming graphs
* Graph algorithms in GraphX

BIG DATA INNOVATION! USING GPT AND GENERATIVE AI TECHNOLOGIES WITH SPARK AND SCALA
* Overview of generative AI technologies
* Integrating GPT with Spark and Scala
* Practical applications and use cases

Bonus Topics / Time Permitting

INTRODUCTION TO SPARK NLP
* Overview of Spark NLP
* Preprocessing text data
* Text classification and sentiment analysis

PUTTING IT ALL TOGETHER
* Work on a capstone project that integrates multiple aspects of the course, including data processing, machine learning, graph processing, and generative AI technologies.

Mastering Scala with Apache Spark for the Modern Data Enterprise (TTSK7520)
Delivered on request, online
Price on Enquiry

Apache Spark 3 for Data Engineering and Analytics with Python

By Packt

This course primarily focuses on explaining the concepts of Python and PySpark. It will help you enhance your data analysis skills using Spark's structured DataFrame APIs.

Apache Spark 3 for Data Engineering and Analytics with Python
Delivered Online On Demand
£41.99

Educators matching "Apache Spark"

Nobleprog Pakistan

NobleProg is an international training and consultancy group, delivering high-quality courses to every sector, covering Artificial Intelligence, IT, Management, and Applied Statistics. Over the last 17 years, we have trained more than 50,000 people from over 6000 companies and organisations. Our courses include classroom (both public and closed) and instructor-led online formats, giving you choice and flexibility to suit your time, budget and level of expertise. We practice what we preach: we use a great deal of the technologies and methods that we teach, and continuously upgrade and improve our courses, keeping up to date with all the latest developments. Our trainers are hand-picked and have been through rigorous checks and interviews, and all courses are evaluated by delegates, ensuring continuous feedback and improvement.

NobleProg in numbers
* 17+ years of experience
* 15+ offices all over the world
* 1000+ trainers cooperating with NobleProg
* 1400+ course outlines offered
* 6100+ companies that entrusted us
* 58k+ satisfied participants

NobleProg - The World’s Local Training Provider
Our mission is to provide comprehensive training and consultancy solutions all over the world, in an effective and accessible way, tailored to consumers’ needs. We offer practical, real-world knowledge supported by a full understanding of the theory. Our expert trainers are skilled in the latest knowledge transfer techniques, blending presentation, demonstration and hands-on learning. We understand that our learners are excited to be gaining new skills and we thrive off that energy to deliver exceptional training events. Investing in upskilling or reskilling with NobleProg means you stay ahead. Our catalogue is constantly evolving and we offer the most in-demand courses: Java, JavaScript, SQL, Visual Basic for Applications (VBA), as well as Apache Spark, OpenStack, TensorFlow, Selenium, Artificial Intelligence, and Data Analysis. Our offer consists of more than 1,400 training outlines covering more than 120 technologies. At NobleProg we emphasise the need not only to follow the latest technological trends, but also to anticipate changes. We focus on delivering professional skills and certifications that will have a real impact.

NobleProg's history
NobleProg was established in 2005 in Krakow, Poland, and has gradually expanded its operations to other global markets since. In just two years the first international branch was opened in London. The potential of NobleProg, combined with the rising need for self-development programs, especially in the field of technological skills, prompted the company to change its business model to a franchise. By doing so, in a short period of time the company allowed a number of people passionate about education and new technologies to join the NobleProg Team. With each year the territorial reach of NobleProg expanded further and we now have offices on every continent. NobleProg is the World's Local Training Provider.

Nexus Human

London

Nexus Human, established over 20 years ago, stands as a pillar of excellence in the realm of IT and Business Skills Training and education in Ireland and the UK.  For over two decades, Nexus Human has been a steadfast source of reliable and high-quality training solutions, catering to a diverse range of professional and educational needs. With a strong reputation in the Training Industry, Nexus Human has consistently demonstrated its commitment to equipping individuals and organisations with the skills and knowledge required to thrive in today's dynamic world.  Our training programs span a wide spectrum, encompassing IT certifications, business skills, and much more.   What sets Nexus Human apart is our unwavering dedication to staying at the forefront of industry trends and technology advancements.  Our expert instructors, coupled with cutting-edge training resources, ensure that students receive the most up-to-date and relevant knowledge available. The impact of Nexus Human extends far and wide, helping individuals enhance their career prospects and aiding businesses in achieving their goals.  This 20-year journey has solidified our institution's standing as a trusted partner in personal and professional growth, offering reliable, excellent training that continues to shape the future.  Whether you seek to upskill, reskill, or simply stay ahead of the curve, Nexus Human is the place to turn for an educational experience marked by quality, reliability, and innovation.