Apache Spark courses in Glasgow

We couldn't find any listings for your search. Explore our online options below.

Online Options

Spark Programming in Scala for Beginners with Apache Spark 3

By Packt

This course does not require any prior knowledge of Apache Spark or Hadoop. The author explains Spark architecture and fundamental concepts to help you come up to speed and grasp the content of this course. The course will help you understand Spark programming and apply that knowledge to build data engineering solutions.

Delivered Online On Demand

£14.99

Apache Spark 3 Advance Skills for Cracking Job Interviews

By Packt

A carefully structured advanced-level course on Apache Spark 3 to help you clear your job interviews. This course covers advanced topics and concepts that are part of the Databricks Spark certification exam. Boost your skills in Spark 3 architecture and memory management.

Delivered Online On Demand

£67.99

Real-Time Stream Processing Using Apache Spark 3 for Scala Developers

By Packt

Learn how to design and develop big data engineering projects using Apache Spark. This example-driven, advanced-level course will help you understand real-time stream processing with Apache Spark and apply that knowledge to build real-time stream processing solutions.

Delivered Online On Demand

£22.99

Real-Time Stream Processing Using Apache Spark 3 for Python Developers

By Packt

Get to grips with real-time stream processing using PySpark as well as Spark structured streaming and apply that knowledge to build stream processing solutions. This course is example-driven and follows a working session-like approach.

Delivered Online On Demand

£93.99

Apache Spark with Scala - Hands-On with Big Data!

By Packt

This is a comprehensive and practical Apache Spark course. You will learn to frame data analysis problems as Spark problems through 20+ hands-on examples, and then scale them up to run on cloud computing services. The course covers Spark 3, IntelliJ, and Structured Streaming, with a stronger focus on the Dataset API.

Delivered Online On Demand

£74.99

Spark Programming in Python for Beginners with Apache Spark 3

By Packt

Advance your data skills by mastering Spark programming in Python. This beginner-level course will help you understand the core concepts of Apache Spark 3 and show you how to apply those concepts to build data engineering solutions.

Delivered Online On Demand

£37.99

Mastering Scala with Apache Spark for the Modern Data Enterprise (TTSK7520)

By Nexus Human

Duration: 5 Days

30 CPD hours

This course is intended for

This intermediate and beyond level course is geared for experienced technical professionals in various roles, such as developers, data analysts, data engineers, software engineers, and machine learning engineers who want to leverage Scala and Spark to tackle complex data challenges and develop scalable, high-performance applications across diverse domains. Practical programming experience is required to participate in the hands-on labs.

Overview

Working in a hands-on learning environment led by our expert instructor you'll:

* Develop a basic understanding of Scala and Apache Spark fundamentals, enabling you to confidently create scalable and high-performance applications.
* Learn how to process large datasets efficiently, helping you handle complex data challenges and make data-driven decisions.
* Gain hands-on experience with real-time data streaming, allowing you to manage and analyze data as it flows into your applications.
* Acquire practical knowledge of machine learning algorithms using Spark MLlib, empowering you to create intelligent applications and uncover hidden insights.
* Master graph processing with GraphX, enabling you to analyze and visualize complex relationships in your data.
* Discover generative AI technologies using GPT with Spark and Scala, opening up new possibilities for automating content generation and enhancing data analysis.

Embark on a journey to master the world of big data with our immersive course on Scala and Spark! Mastering Scala with Apache Spark for the Modern Data Enterprise is a five-day, hands-on course designed to provide you with the essential skills and tools to tackle complex data projects using the Scala programming language and Apache Spark, a high-performance data processing engine. Mastering these technologies will enable you to perform a wide range of tasks, from data wrangling and analytics to machine learning and artificial intelligence, across various industries and applications.

Guided by our expert instructor, you'll explore the fundamentals of Scala programming and Apache Spark while gaining valuable hands-on experience with Spark programming, RDDs, DataFrames, Spark SQL, and data sources. You'll also explore Spark Streaming, performance optimization techniques, and the integration of popular external libraries, tools, and cloud platforms like AWS, Azure, and GCP. Machine learning enthusiasts will delve into Spark MLlib, covering basics of machine learning algorithms, data preparation, feature extraction, and various techniques such as regression, classification, clustering, and recommendation systems.

INTRODUCTION TO SCALA
* Brief history and motivation
* Differences between Scala and Java
* Basic Scala syntax and constructs
* Scala's functional programming features

INTRODUCTION TO APACHE SPARK
* Overview and history
* Spark components and architecture
* Spark ecosystem
* Comparing Spark with other big data frameworks

BASICS OF SPARK PROGRAMMING
* SparkContext and SparkSession
* Resilient Distributed Datasets (RDDs)
* Transformations and Actions
* Working with DataFrames

SPARK SQL AND DATA SOURCES
* Spark SQL library and its advantages
* Structured and semi-structured data sources
* Reading and writing data in various formats (CSV, JSON, Parquet, Avro, etc.)
* Data manipulation using SQL queries

BASIC RDD OPERATIONS
* Creating and manipulating RDDs
* Common transformations and actions on RDDs
* Working with key-value data

BASIC DATAFRAME AND DATASET OPERATIONS
* Creating and manipulating DataFrames and Datasets
* Column operations and functions
* Filtering, sorting, and aggregating data

INTRODUCTION TO SPARK STREAMING
* Overview of Spark Streaming
* Discretized Stream (DStream) operations
* Windowed operations and stateful processing

PERFORMANCE OPTIMIZATION BASICS
* Best practices for efficient Spark code
* Broadcast variables and accumulators
* Monitoring Spark applications

INTEGRATING EXTERNAL LIBRARIES AND TOOLS, SPARK STREAMING
* Using popular external libraries, such as Hadoop and HBase
* Integrating with cloud platforms: AWS, Azure, GCP
* Connecting to data storage systems: HDFS, S3, Cassandra, etc.

INTRODUCTION TO MACHINE LEARNING BASICS
* Overview of machine learning
* Supervised and unsupervised learning
* Common algorithms and use cases

INTRODUCTION TO SPARK MLLIB
* Overview of Spark MLlib
* MLlib's algorithms and utilities
* Data preparation and feature extraction

LINEAR REGRESSION AND CLASSIFICATION
* Linear regression algorithm
* Logistic regression for classification
* Model evaluation and performance metrics

CLUSTERING ALGORITHMS
* Overview of clustering algorithms
* K-means clustering
* Model evaluation and performance metrics

COLLABORATIVE FILTERING AND RECOMMENDATION SYSTEMS
* Overview of recommendation systems
* Collaborative filtering techniques
* Implementing recommendations with Spark MLlib

INTRODUCTION TO GRAPH PROCESSING
* Overview of graph processing
* Use cases and applications of graph processing
* Graph representations and operations
* Introduction to Spark GraphX
* Overview of GraphX
* Creating and transforming graphs
* Graph algorithms in GraphX

BIG DATA INNOVATION! USING GPT AND GENERATIVE AI TECHNOLOGIES WITH SPARK AND SCALA
* Overview of generative AI technologies
* Integrating GPT with Spark and Scala
* Practical applications and use cases

Bonus Topics / Time Permitting

INTRODUCTION TO SPARK NLP
* Overview of Spark NLP
* Preprocessing text data
* Text classification and sentiment analysis

PUTTING IT ALL TOGETHER
* Work on a capstone project that integrates multiple aspects of the course, including data processing, machine learning, graph processing, and generative AI technologies.

Delivered Online, on request

Price on Enquiry

Apache Spark 3 for Data Engineering and Analytics with Python

By Packt

This course primarily focuses on explaining the concepts of Python and PySpark. It will help you enhance your data analysis skills using Spark's structured DataFrame APIs.

Delivered Online On Demand

£41.99

DP-601T00 Implementing a Lakehouse with Microsoft Fabric

By Nexus Human

Duration: 1 Day

6 CPD hours

This course is intended for

The primary audience for this course is data professionals who are familiar with data modeling, extraction, and analytics. It is designed for professionals who are interested in gaining knowledge about Lakehouse architecture, the Microsoft Fabric platform, and how to enable end-to-end analytics using these technologies.

Job role: Data Analyst, Data Engineer, Data Scientist

Overview

* Describe end-to-end analytics in Microsoft Fabric
* Describe core features and capabilities of lakehouses in Microsoft Fabric
* Create a lakehouse
* Ingest data into files and tables in a lakehouse
* Query lakehouse tables with SQL
* Configure Spark in a Microsoft Fabric workspace
* Identify suitable scenarios for Spark notebooks and Spark jobs
* Use Spark dataframes to analyze and transform data
* Use Spark SQL to query data in tables and views
* Visualize data in a Spark notebook
* Understand Delta Lake and delta tables in Microsoft Fabric
* Create and manage delta tables using Spark
* Use Spark to query and transform data in delta tables
* Use delta tables with Spark structured streaming
* Describe Dataflow (Gen2) capabilities in Microsoft Fabric
* Create Dataflow (Gen2) solutions to ingest and transform data
* Include a Dataflow (Gen2) in a pipeline

This course is designed to build your foundational skills in data engineering on Microsoft Fabric, focusing on the Lakehouse concept. This course will explore the powerful capabilities of Apache Spark for distributed data processing and the essential techniques for efficient data management, versioning, and reliability by working with Delta Lake tables. This course will also explore data ingestion and orchestration using Dataflows Gen2 and Data Factory pipelines. This course includes a combination of lectures and hands-on exercises that will prepare you to work with lakehouses in Microsoft Fabric.

INTRODUCTION TO END-TO-END ANALYTICS USING MICROSOFT FABRIC
* Explore end-to-end analytics with Microsoft Fabric
* Data teams and Microsoft Fabric
* Enable and use Microsoft Fabric
* Knowledge Check

GET STARTED WITH LAKEHOUSES IN MICROSOFT FABRIC
* Explore the Microsoft Fabric Lakehouse
* Work with Microsoft Fabric Lakehouses
* Exercise - Create and ingest data with a Microsoft Fabric Lakehouse

USE APACHE SPARK IN MICROSOFT FABRIC
* Prepare to use Apache Spark
* Run Spark code
* Work with data in a Spark dataframe
* Work with data using Spark SQL
* Visualize data in a Spark notebook
* Exercise - Analyze data with Apache Spark

WORK WITH DELTA LAKE TABLES IN MICROSOFT FABRIC
* Understand Delta Lake
* Create delta tables
* Work with delta tables in Spark
* Use delta tables with streaming data
* Exercise - Use delta tables in Apache Spark

INGEST DATA WITH DATAFLOWS GEN2 IN MICROSOFT FABRIC
* Understand Dataflows (Gen2) in Microsoft Fabric
* Explore Dataflows (Gen2) in Microsoft Fabric
* Integrate Dataflows (Gen2) and Pipelines in Microsoft Fabric
* Exercise - Create and use a Dataflow (Gen2) in Microsoft Fabric

Delivered Online: Two days, Aug 26th, 13:00 + 2 more

£595

DP-203T00 Data Engineering on Microsoft Azure

By Nexus Human

Duration: 4 Days

24 CPD hours

This course is intended for

The primary audience for this course is data professionals, data architects, and business intelligence professionals who want to learn about data engineering and building analytical solutions using data platform technologies that exist on Microsoft Azure. The secondary audience for this course includes data analysts and data scientists who work with analytical solutions built on Microsoft Azure.

In this course, the student will learn how to implement and manage data engineering workloads on Microsoft Azure, using Azure services such as Azure Synapse Analytics, Azure Data Lake Storage Gen2, Azure Stream Analytics, Azure Databricks, and others. The course focuses on common data engineering tasks such as orchestrating data transfer and transformation pipelines, working with data files in a data lake, creating and loading relational data warehouses, capturing and aggregating streams of real-time data, and tracking data assets and lineage.

Prerequisites

Successful students start this course with knowledge of cloud computing and core data concepts and professional experience with data solutions.

* AZ-900T00 Microsoft Azure Fundamentals
* DP-900T00 Microsoft Azure Data Fundamentals

1 - INTRODUCTION TO DATA ENGINEERING ON AZURE
* What is data engineering
* Important data engineering concepts
* Data engineering in Microsoft Azure

2 - INTRODUCTION TO AZURE DATA LAKE STORAGE GEN2
* Understand Azure Data Lake Storage Gen2
* Enable Azure Data Lake Storage Gen2 in Azure Storage
* Compare Azure Data Lake Store to Azure Blob storage
* Understand the stages for processing big data
* Use Azure Data Lake Storage Gen2 in data analytics workloads

3 - INTRODUCTION TO AZURE SYNAPSE ANALYTICS
* What is Azure Synapse Analytics
* How Azure Synapse Analytics works
* When to use Azure Synapse Analytics

4 - USE AZURE SYNAPSE SERVERLESS SQL POOL TO QUERY FILES IN A DATA LAKE
* Understand Azure Synapse serverless SQL pool capabilities and use cases
* Query files using a serverless SQL pool
* Create external database objects

5 - USE AZURE SYNAPSE SERVERLESS SQL POOLS TO TRANSFORM DATA IN A DATA LAKE
* Transform data files with the CREATE EXTERNAL TABLE AS SELECT statement
* Encapsulate data transformations in a stored procedure
* Include a data transformation stored procedure in a pipeline

6 - CREATE A LAKE DATABASE IN AZURE SYNAPSE ANALYTICS
* Understand lake database concepts
* Explore database templates
* Create a lake database
* Use a lake database

7 - ANALYZE DATA WITH APACHE SPARK IN AZURE SYNAPSE ANALYTICS
* Get to know Apache Spark
* Use Spark in Azure Synapse Analytics
* Analyze data with Spark
* Visualize data with Spark

8 - TRANSFORM DATA WITH SPARK IN AZURE SYNAPSE ANALYTICS
* Modify and save dataframes
* Partition data files
* Transform data with SQL

9 - USE DELTA LAKE IN AZURE SYNAPSE ANALYTICS
* Understand Delta Lake
* Create Delta Lake tables
* Create catalog tables
* Use Delta Lake with streaming data
* Use Delta Lake in a SQL pool

10 - ANALYZE DATA IN A RELATIONAL DATA WAREHOUSE
* Design a data warehouse schema
* Create data warehouse tables
* Load data warehouse tables
* Query a data warehouse

11 - LOAD DATA INTO A RELATIONAL DATA WAREHOUSE
* Load staging tables
* Load dimension tables
* Load time dimension tables
* Load slowly changing dimensions
* Load fact tables
* Perform post load optimization

12 - BUILD A DATA PIPELINE IN AZURE SYNAPSE ANALYTICS
* Understand pipelines in Azure Synapse Analytics
* Create a pipeline in Azure Synapse Studio
* Define data flows
* Run a pipeline

13 - USE SPARK NOTEBOOKS IN AN AZURE SYNAPSE PIPELINE
* Understand Synapse Notebooks and Pipelines
* Use a Synapse notebook activity in a pipeline
* Use parameters in a notebook

14 - PLAN HYBRID TRANSACTIONAL AND ANALYTICAL PROCESSING USING AZURE SYNAPSE ANALYTICS
* Understand hybrid transactional and analytical processing patterns
* Describe Azure Synapse Link

15 - IMPLEMENT AZURE SYNAPSE LINK WITH AZURE COSMOS DB
* Enable Cosmos DB account to use Azure Synapse Link
* Create an analytical store enabled container
* Create a linked service for Cosmos DB
* Query Cosmos DB data with Spark
* Query Cosmos DB with Synapse SQL

16 - IMPLEMENT AZURE SYNAPSE LINK FOR SQL
* What is Azure Synapse Link for SQL?
* Configure Azure Synapse Link for Azure SQL Database
* Configure Azure Synapse Link for SQL Server 2022

17 - GET STARTED WITH AZURE STREAM ANALYTICS
* Understand data streams
* Understand event processing
* Understand window functions

18 - INGEST STREAMING DATA USING AZURE STREAM ANALYTICS AND AZURE SYNAPSE ANALYTICS
* Stream ingestion scenarios
* Configure inputs and outputs
* Define a query to select, filter, and aggregate data
* Run a job to ingest data

19 - VISUALIZE REAL-TIME DATA WITH AZURE STREAM ANALYTICS AND POWER BI
* Use a Power BI output in Azure Stream Analytics
* Create a query for real-time visualization
* Create real-time data visualizations in Power BI

20 - INTRODUCTION TO MICROSOFT PURVIEW
* What is Microsoft Purview?
* How Microsoft Purview works
* When to use Microsoft Purview

21 - INTEGRATE MICROSOFT PURVIEW AND AZURE SYNAPSE ANALYTICS
* Catalog Azure Synapse Analytics data assets in Microsoft Purview
* Connect Microsoft Purview to an Azure Synapse Analytics workspace
* Search a Purview catalog in Synapse Studio
* Track data lineage in pipelines

22 - EXPLORE AZURE DATABRICKS
* Get started with Azure Databricks
* Identify Azure Databricks workloads
* Understand key concepts

23 - USE APACHE SPARK IN AZURE DATABRICKS
* Get to know Spark
* Create a Spark cluster
* Use Spark in notebooks
* Use Spark to work with data files
* Visualize data

24 - RUN AZURE DATABRICKS NOTEBOOKS WITH AZURE DATA FACTORY
* Understand Azure Databricks notebooks and pipelines
* Create a linked service for Azure Databricks
* Use a Notebook activity in a pipeline
* Use parameters in a notebook

ADDITIONAL COURSE DETAILS:

Nexus Humans DP-203T00 Data Engineering on Microsoft Azure training program is a workshop that presents an invigorating mix of sessions, lessons, and masterclasses meticulously crafted to propel your learning expedition forward. This immersive bootcamp-style experience boasts interactive lectures, hands-on labs, and collaborative hackathons, all strategically designed to fortify fundamental concepts. Guided by seasoned coaches, each session offers priceless insights and practical skills crucial for honing your expertise. Whether you're stepping into the realm of professional skills or a seasoned professional, this comprehensive course ensures you're equipped with the knowledge and prowess necessary for success.

While we feel this is the best course for the DP-203T00 Data Engineering on Microsoft Azure course and one of our Top 10 we encourage you to read the course outline to make sure it is the right content for you. Additionally, private sessions, closed classes or dedicated events are available both live online and at our training centres in Dublin and London, as well as at your offices anywhere in the UK, Ireland or across EMEA.

Delivered Online: 5 days, Jun 24th, 13:00 + 4 more

£2380
