234 Big Data courses delivered Online

Streaming Big Data with Spark Streaming, Scala, and Spark 3!

By Packt

In this course, we will process massive streams of real-time data using Spark Streaming and create Spark applications using the Scala programming language (v2.12). We will also get hands-on with some real live Twitter data, simulated streams of Apache access logs, and even data used to train machine learning models.

Delivered Online On Demand
£74.99

Apache Spark with Scala - Hands-On with Big Data!

By Packt

This is a comprehensive and practical Apache Spark course. In this course, you will learn and master the art of framing data analysis problems as Spark problems through 20+ hands-on examples, and then scale them up to run on cloud computing services. The course covers Spark 3, IntelliJ, and Structured Streaming, with a stronger focus on the Dataset API.

Delivered Online On Demand
£74.99

Professional Certificate Course in Big Data Infrastructure in London 2024

4.9(261)

By Metropolitan School of Business & Management UK

Dive into the heart of Big Data Infrastructure, exploring storage systems, distributed file frameworks, and processing paradigms. This course provides a comprehensive understanding of key components like HDFS, Apache Spark, and Cassandra, offering insights into their architecture, use cases, and real-world applications.

This course is a deep dive into the complex landscape of Big Data Infrastructure. From unravelling the architecture of Apache Spark to dissecting the benefits of distributed file systems, participants gain expertise in assessing, comparing, and implementing various Big Data storage and processing systems. Scalability, fault tolerance, and industry-specific case studies add practical depth to theoretical knowledge.

After the successful completion of this course, you will be able to:
* Understand the components of Big Data infrastructure, including storage systems, distributed file systems, and processing frameworks.
* Identify the characteristics and benefits of distributed file systems such as the Hadoop Distributed File System (HDFS).
* Describe the architecture and capabilities of Apache Spark and its role in Big Data processing.
* Recognise the use cases and benefits of Apache Cassandra as a distributed NoSQL database.
* Compare and contrast different Big Data storage and processing systems such as Hadoop, Spark, and Cassandra.
* Understand the scalability and fault-tolerance mechanisms used in Big Data infrastructure, such as sharding and replication.
* Appreciate the challenges associated with deploying and managing Big Data infrastructure, such as hardware and software configuration and security considerations.

Explore the intricacies of Big Data Infrastructure, from understanding storage systems to unravelling the nuances of distributed file frameworks and processing engines. Gain a comprehensive view of scalability, fault-tolerance mechanisms, and industry-specific challenges through engaging case studies. Equip yourself to navigate the dynamic landscape of Big Data with confidence and expertise.

* VIDEO - COURSE STRUCTURE AND ASSESSMENT GUIDELINES: Watch this video to gain further insight.
* NAVIGATING THE MSBM STUDY PORTAL: Watch this video to gain further insight.
* INTERACTING WITH LECTURES/LEARNING COMPONENTS: Watch this video to gain further insight.
* BIG DATA INFRASTRUCTURE: Self-paced pre-recorded learning content on this topic.
* BIG DATA INFRASTRUCTURE: Put your knowledge to the test with this quiz. Read each question carefully and choose the response that you feel is correct.

All MSBM courses are accredited by the relevant partners and awarding bodies. Please refer to MSBM accreditation in about us for more details.

There are no strict entry requirements for this course. Work experience will be an added advantage to understanding the content of the course.

The certificate is designed to enhance the learner's knowledge in the field. This certificate is for everyone who is eager to know more and get updated on current ideas in their respective field. We recommend this certificate for the following audience:
* Big Data Infrastructure Engineer
* Hadoop Administrator
* Spark Developer
* Cassandra Database Administrator
* Big Data Solutions Architect
* Data Infrastructure Manager
* NoSQL Database Analyst
* Big Data Consultant

* AVERAGE COMPLETION TIME: 2 Weeks
* ACCREDITATION: 3 CPD Hours
* LEVEL: Advanced
* START TIME: Anytime
* 100% ONLINE: Study online with ease.
* UNLIMITED ACCESS: 24/7 unlimited access with pre-recorded lectures.
* LOW FEES: Our fees are low and easy to pay online.

Delivered Online On Demand
£28

PySpark and AWS: Master Big Data with PySpark and AWS

By Packt

The course is crafted to reflect the most in-demand workplace skills. It will help you understand all the essential concepts and methodologies of PySpark. This course provides a detailed compilation of all the basics, which will motivate you to make quick progress and go beyond what you have learned.

Delivered Online On Demand
£101.99

Tableau Desktop - Part 1

By Nexus Human

Duration: 2 Days
12 CPD hours

Overview
* Identify and configure basic functions of Tableau.
* Connect to data sources, import data into Tableau, and save Tableau files.
* Create views and customize data in visualizations.
* Manage, sort, and group data.
* Save and share data sources and workbooks.
* Filter data in views.
* Customize visualizations with annotations, highlights, and advanced features.
* Create and enhance dashboards in Tableau.
* Create and enhance stories in Tableau.

As technology progresses and becomes more interwoven with our businesses and lives, more and more data is collected about business and personal activities. This era of "big data" has exploded due to the rise of cloud computing, which provides an abundance of computational power and storage, allowing organizations of all sorts to capture and store data. Leveraging that data effectively can provide timely insights and competitive advantage.

The creation of data-backed visualizations is a key way data scientists, or any professional, can explore, analyze, and report insights and trends from data. Tableau® software is designed for this purpose. Tableau was built to connect to a wide range of data sources and allows users to quickly create visualizations of connected data to gain insights, show trends, and create reports. Tableau's data connection capabilities and visualization features go far beyond those that can be found in spreadsheets, allowing users to create compelling and interactive worksheets, dashboards, and stories that bring data to life and turn data into thoughtful action.

Prerequisites
To ensure your success in this course, you should have experience managing data with Microsoft® Excel® or Google Sheets™.

LESSON 1: TABLEAU FUNDAMENTALS
* Topic A: Overview of Tableau
* Topic B: Navigate and Configure Tableau

LESSON 2: CONNECTING TO AND PREPARING DATA
* Topic A: Connect to Data
* Topic B: Build a Data Model
* Topic C: Save Workbook Files
* Topic D: Prepare Data for Analysis

LESSON 3: EXPLORING DATA
* Topic A: Create Views
* Topic B: Customize Data in Visualizations

LESSON 4: MANAGING, SORTING, AND GROUPING DATA
* Topic A: Adjust Fields
* Topic B: Sort Data
* Topic C: Group Data

LESSON 5: SAVING, PUBLISHING, AND SHARING DATA
* Topic A: Save Data Sources
* Topic B: Publish Data Sources and Visualizations
* Topic C: Share Workbooks for Collaboration

LESSON 6: FILTERING DATA
* Topic A: Configure Worksheet Filters
* Topic B: Apply Advanced Filter Options
* Topic C: Create Interactive Filters

LESSON 7: CUSTOMIZING VISUALIZATIONS
* Topic A: Format and Annotate Views
* Topic B: Emphasize Data in Visualizations
* Topic C: Create Animated Workbooks
* Topic D: Best Practices for Visual Design

LESSON 8: CREATING DASHBOARDS IN TABLEAU
* Topic A: Create Dashboards
* Topic B: Enhance Dashboards with Actions
* Topic C: Create Mobile Dashboards

LESSON 9: CREATING STORIES IN TABLEAU
* Topic A: Create Stories
* Topic B: Enhance Stories with Tooltips

Delivered Online, 3 days, Jun 24th, 13:00
£1400

DP-900T00 Microsoft Azure Data Fundamentals

By Nexus Human

Duration: 1 Day
6 CPD hours

This course is intended for
Individuals who want to learn the fundamentals of database concepts in a cloud environment, gain basic skills in cloud data services, and build their foundational knowledge of cloud data services within Microsoft Azure.

Overview
* Describe core data concepts
* Identify considerations for relational data on Azure
* Describe considerations for working with non-relational data on Azure
* Describe an analytics workload on Azure

In this course, students will gain foundational knowledge of core data concepts and related Microsoft Azure data services. Students will learn about core data concepts such as relational, non-relational, big data, and analytics, and build their foundational knowledge of cloud data services within Microsoft Azure. Students will explore fundamental relational data concepts and relational database services in Azure. They will explore Azure storage for non-relational data and the fundamentals of Azure Cosmos DB. Students will learn about large-scale data warehousing, real-time analytics, and data visualization.

1 - EXPLORE CORE DATA CONCEPTS
* Identify data formats
* Explore file storage
* Explore databases
* Explore transactional data processing
* Explore analytical data processing

2 - EXPLORE DATA ROLES AND SERVICES
* Explore job roles in the world of data
* Identify data services

3 - EXPLORE FUNDAMENTAL RELATIONAL DATA CONCEPTS
* Understand relational data
* Understand normalization
* Explore SQL
* Describe database objects

4 - EXPLORE RELATIONAL DATABASE SERVICES IN AZURE
* Describe Azure SQL services and capabilities
* Describe Azure services for open-source databases

5 - EXPLORE AZURE STORAGE FOR NON-RELATIONAL DATA
* Explore Azure blob storage
* Explore Azure Data Lake Storage Gen2
* Explore Azure Files
* Explore Azure Tables

6 - EXPLORE FUNDAMENTALS OF AZURE COSMOS DB
* Describe Azure Cosmos DB
* Identify Azure Cosmos DB APIs

7 - EXPLORE FUNDAMENTALS OF LARGE-SCALE DATA WAREHOUSING
* Describe data warehousing architecture
* Explore data ingestion pipelines
* Explore analytical data stores

8 - EXPLORE FUNDAMENTALS OF REAL-TIME ANALYTICS
* Understand batch and stream processing
* Explore common elements of stream processing architecture
* Explore Azure Stream Analytics
* Explore Apache Spark on Microsoft Azure

9 - EXPLORE FUNDAMENTALS OF DATA VISUALIZATION
* Describe Power BI tools and workflow
* Describe core concepts of data modeling
* Describe considerations for data visualization

Delivered Online, Two days, Jun 24th, 13:00 + 3 more
£595

Designing and Building Big Data Applications

By Nexus Human

Duration: 4 Days
24 CPD hours

This course is intended for
Developers, engineers, and architects who want to use Hadoop and related tools to solve real-world problems.

Overview
Skills learned in this course include:
* Creating a data set with Kite SDK
* Developing custom Flume components for data ingestion
* Managing a multi-stage workflow with Oozie
* Analyzing data with Crunch
* Writing user-defined functions for Hive and Impala
* Indexing data with Cloudera Search

Cloudera University's four-day course for designing and building Big Data applications prepares you to analyze and solve real-world problems using Apache Hadoop and associated tools in the enterprise data hub (EDH).

INTRODUCTION

APPLICATION ARCHITECTURE
* Scenario Explanation
* Understanding the Development Environment
* Identifying and Collecting Input Data
* Selecting Tools for Data Processing and Analysis
* Presenting Results to the User

DEFINING & USING DATASETS
* Metadata Management
* What is Apache Avro?
* Avro Schemas
* Avro Schema Evolution
* Selecting a File Format
* Performance Considerations

USING THE KITE SDK DATA MODULE
* What is the Kite SDK?
* Fundamental Data Module Concepts
* Creating New Data Sets Using the Kite SDK
* Loading, Accessing, and Deleting a Data Set

IMPORTING RELATIONAL DATA WITH APACHE SQOOP
* What is Apache Sqoop?
* Basic Imports
* Limiting Results
* Improving Sqoop's Performance
* Sqoop 2

CAPTURING DATA WITH APACHE FLUME
* What is Apache Flume?
* Basic Flume Architecture
* Flume Sources
* Flume Sinks
* Flume Configuration
* Logging Application Events to Hadoop

DEVELOPING CUSTOM FLUME COMPONENTS
* Flume Data Flow and Common Extension Points
* Custom Flume Sources
* Developing a Flume Pollable Source
* Developing a Flume Event-Driven Source
* Custom Flume Interceptors
* Developing a Header-Modifying Flume Interceptor
* Developing a Filtering Flume Interceptor
* Writing Avro Objects with a Custom Flume Interceptor

MANAGING WORKFLOWS WITH APACHE OOZIE
* The Need for Workflow Management
* What is Apache Oozie?
* Defining an Oozie Workflow
* Validation, Packaging, and Deployment
* Running and Tracking Workflows Using the CLI
* Hue UI for Oozie

PROCESSING DATA PIPELINES WITH APACHE CRUNCH
* What is Apache Crunch?
* Understanding the Crunch Pipeline
* Comparing Crunch to Java MapReduce
* Working with Crunch Projects
* Reading and Writing Data in Crunch
* Data Collection API Functions
* Utility Classes in the Crunch API

WORKING WITH TABLES IN APACHE HIVE
* What is Apache Hive?
* Accessing Hive
* Basic Query Syntax
* Creating and Populating Hive Tables
* How Hive Reads Data
* Using the RegexSerDe in Hive

DEVELOPING USER-DEFINED FUNCTIONS
* What are User-Defined Functions?
* Implementing a User-Defined Function
* Deploying Custom Libraries in Hive
* Registering a User-Defined Function in Hive

EXECUTING INTERACTIVE QUERIES WITH IMPALA
* What is Impala?
* Comparing Hive to Impala
* Running Queries in Impala
* Support for User-Defined Functions
* Data and Metadata Management

UNDERSTANDING CLOUDERA SEARCH
* What is Cloudera Search?
* Search Architecture
* Supported Document Formats

INDEXING DATA WITH CLOUDERA SEARCH
* Collection and Schema Management
* Morphlines
* Indexing Data in Batch Mode
* Indexing Data in Near Real Time

PRESENTING RESULTS TO USERS
* Solr Query Syntax
* Building a Search UI with Hue
* Accessing Impala through JDBC
* Powering a Custom Web Application with Impala and Search

Delivered Online, on request
Price on Enquiry

Scala & Spark-Master Big Data with Scala and Spark

By Packt

Scala is doubtless one of the most in-demand skills for data scientists and data engineers. This competitive course will teach you the essential concepts and methodologies of Scala with a lot of practical implementations.

Delivered Online On Demand
£93.99

Master Big Data Ingestion and Analytics with Flume, Sqoop, Hive and Spark

By Packt

A complete course on Sqoop, Flume, and Hive: Ideal for achieving CCA175 and Hortonworks Spark Certification

Delivered Online On Demand
£70.99

DP-203T00 Data Engineering on Microsoft Azure

By Nexus Human

Duration: 4 Days
24 CPD hours

This course is intended for
The primary audience for this course is data professionals, data architects, and business intelligence professionals who want to learn about data engineering and building analytical solutions using data platform technologies that exist on Microsoft Azure. The secondary audience for this course includes data analysts and data scientists who work with analytical solutions built on Microsoft Azure.

In this course, the student will learn how to implement and manage data engineering workloads on Microsoft Azure, using Azure services such as Azure Synapse Analytics, Azure Data Lake Storage Gen2, Azure Stream Analytics, Azure Databricks, and others. The course focuses on common data engineering tasks such as orchestrating data transfer and transformation pipelines, working with data files in a data lake, creating and loading relational data warehouses, capturing and aggregating streams of real-time data, and tracking data assets and lineage.

Prerequisites
Successful students start this course with knowledge of cloud computing and core data concepts and professional experience with data solutions.
* AZ-900T00 Microsoft Azure Fundamentals
* DP-900T00 Microsoft Azure Data Fundamentals

1 - INTRODUCTION TO DATA ENGINEERING ON AZURE
* What is data engineering
* Important data engineering concepts
* Data engineering in Microsoft Azure

2 - INTRODUCTION TO AZURE DATA LAKE STORAGE GEN2
* Understand Azure Data Lake Storage Gen2
* Enable Azure Data Lake Storage Gen2 in Azure Storage
* Compare Azure Data Lake Store to Azure Blob storage
* Understand the stages for processing big data
* Use Azure Data Lake Storage Gen2 in data analytics workloads

3 - INTRODUCTION TO AZURE SYNAPSE ANALYTICS
* What is Azure Synapse Analytics
* How Azure Synapse Analytics works
* When to use Azure Synapse Analytics

4 - USE AZURE SYNAPSE SERVERLESS SQL POOL TO QUERY FILES IN A DATA LAKE
* Understand Azure Synapse serverless SQL pool capabilities and use cases
* Query files using a serverless SQL pool
* Create external database objects

5 - USE AZURE SYNAPSE SERVERLESS SQL POOLS TO TRANSFORM DATA IN A DATA LAKE
* Transform data files with the CREATE EXTERNAL TABLE AS SELECT statement
* Encapsulate data transformations in a stored procedure
* Include a data transformation stored procedure in a pipeline

6 - CREATE A LAKE DATABASE IN AZURE SYNAPSE ANALYTICS
* Understand lake database concepts
* Explore database templates
* Create a lake database
* Use a lake database

7 - ANALYZE DATA WITH APACHE SPARK IN AZURE SYNAPSE ANALYTICS
* Get to know Apache Spark
* Use Spark in Azure Synapse Analytics
* Analyze data with Spark
* Visualize data with Spark

8 - TRANSFORM DATA WITH SPARK IN AZURE SYNAPSE ANALYTICS
* Modify and save dataframes
* Partition data files
* Transform data with SQL

9 - USE DELTA LAKE IN AZURE SYNAPSE ANALYTICS
* Understand Delta Lake
* Create Delta Lake tables
* Create catalog tables
* Use Delta Lake with streaming data
* Use Delta Lake in a SQL pool

10 - ANALYZE DATA IN A RELATIONAL DATA WAREHOUSE
* Design a data warehouse schema
* Create data warehouse tables
* Load data warehouse tables
* Query a data warehouse

11 - LOAD DATA INTO A RELATIONAL DATA WAREHOUSE
* Load staging tables
* Load dimension tables
* Load time dimension tables
* Load slowly changing dimensions
* Load fact tables
* Perform post load optimization

12 - BUILD A DATA PIPELINE IN AZURE SYNAPSE ANALYTICS
* Understand pipelines in Azure Synapse Analytics
* Create a pipeline in Azure Synapse Studio
* Define data flows
* Run a pipeline

13 - USE SPARK NOTEBOOKS IN AN AZURE SYNAPSE PIPELINE
* Understand Synapse Notebooks and Pipelines
* Use a Synapse notebook activity in a pipeline
* Use parameters in a notebook

14 - PLAN HYBRID TRANSACTIONAL AND ANALYTICAL PROCESSING USING AZURE SYNAPSE ANALYTICS
* Understand hybrid transactional and analytical processing patterns
* Describe Azure Synapse Link

15 - IMPLEMENT AZURE SYNAPSE LINK WITH AZURE COSMOS DB
* Enable Cosmos DB account to use Azure Synapse Link
* Create an analytical store enabled container
* Create a linked service for Cosmos DB
* Query Cosmos DB data with Spark
* Query Cosmos DB with Synapse SQL

16 - IMPLEMENT AZURE SYNAPSE LINK FOR SQL
* What is Azure Synapse Link for SQL?
* Configure Azure Synapse Link for Azure SQL Database
* Configure Azure Synapse Link for SQL Server 2022

17 - GET STARTED WITH AZURE STREAM ANALYTICS
* Understand data streams
* Understand event processing
* Understand window functions

18 - INGEST STREAMING DATA USING AZURE STREAM ANALYTICS AND AZURE SYNAPSE ANALYTICS
* Stream ingestion scenarios
* Configure inputs and outputs
* Define a query to select, filter, and aggregate data
* Run a job to ingest data

19 - VISUALIZE REAL-TIME DATA WITH AZURE STREAM ANALYTICS AND POWER BI
* Use a Power BI output in Azure Stream Analytics
* Create a query for real-time visualization
* Create real-time data visualizations in Power BI

20 - INTRODUCTION TO MICROSOFT PURVIEW
* What is Microsoft Purview?
* How Microsoft Purview works
* When to use Microsoft Purview

21 - INTEGRATE MICROSOFT PURVIEW AND AZURE SYNAPSE ANALYTICS
* Catalog Azure Synapse Analytics data assets in Microsoft Purview
* Connect Microsoft Purview to an Azure Synapse Analytics workspace
* Search a Purview catalog in Synapse Studio
* Track data lineage in pipelines

22 - EXPLORE AZURE DATABRICKS
* Get started with Azure Databricks
* Identify Azure Databricks workloads
* Understand key concepts

23 - USE APACHE SPARK IN AZURE DATABRICKS
* Get to know Spark
* Create a Spark cluster
* Use Spark in notebooks
* Use Spark to work with data files
* Visualize data

24 - RUN AZURE DATABRICKS NOTEBOOKS WITH AZURE DATA FACTORY
* Understand Azure Databricks notebooks and pipelines
* Create a linked service for Azure Databricks
* Use a Notebook activity in a pipeline
* Use parameters in a notebook

ADDITIONAL COURSE DETAILS:
The Nexus Human DP-203T00 Data Engineering on Microsoft Azure training program is a workshop that presents a mix of sessions, lessons, and masterclasses crafted to move your learning forward. This immersive bootcamp-style experience includes interactive lectures, hands-on labs, and collaborative hackathons, all designed to reinforce fundamental concepts. Guided by seasoned coaches, each session offers insights and practical skills for honing your expertise. Whether you are new to the field or a seasoned professional, this comprehensive course aims to equip you with the knowledge necessary for success.

While we feel this is the best course for DP-203T00 Data Engineering on Microsoft Azure and one of our Top 10, we encourage you to read the course outline to make sure it is the right content for you. Additionally, private sessions, closed classes or dedicated events are available both live online and at our training centres in Dublin and London, as well as at your offices anywhere in the UK, Ireland or across EMEA.

Delivered Online, 5 days, Jun 24th, 13:00 + 4 more
£2380