90 Big Data Analytics courses delivered Online

Beginning Data Analytics With R

By Nexus Human

Duration: 3 Days (18 CPD hours)

This course is intended for: anyone who wants to harness the power of data analytics in their organisation.

Overview: After completing this course, delegates will be capable of writing effective R code to manipulate, analyse and visualise data, enabling their organisations to make better, data-driven decisions.

COURSE OUTLINE
Becoming a world-class data analytics practitioner requires mastery of the most sophisticated data analytics tools. The R programming language is one of the most powerful and flexible tools in the data analytics toolkit. This course teaches delegates with no prior programming or data analytics experience how to perform data manipulation, data analysis and data visualisation in R. Mastery of these techniques will allow delegates to immediately add value in their workplace by extracting valuable insight from company data to allow better, data-driven decisions.

The course will explore the following topics through a series of interactive workshop sessions:
* What is R?
* Basic R programming conventions
* Data structures in R
* Accessing data in R
* Descriptive statistics in R
* Statistical analysis in R
* Data manipulation in R
* Data visualisation in R

ADDITIONAL COURSE DETAILS
The Nexus Human Beginning Data Analytics With R training program is a workshop that presents a mix of sessions, lessons, and masterclasses. This bootcamp-style experience combines interactive lectures, hands-on labs, and collaborative hackathons, all designed to reinforce fundamental concepts. Each session is guided by experienced coaches and offers practical skills for developing your expertise. Whether you are new to professional skills or a seasoned professional, this course aims to equip you with the knowledge you need. While we feel this is the best course for Beginning Data Analytics With R, and one of our Top 10, we encourage you to read the course outline to make sure it is the right content for you. Additionally, private sessions, closed classes or dedicated events are available both live online and at our training centres in Dublin and London, as well as at your offices anywhere in the UK, Ireland or across EMEA.

Beginning Data Analytics With R
Delivered on-request, online
Price on Enquiry

Hands-on Predictive Analytics with Python (TTPS4879)

By Nexus Human

Duration: 3 Days (18 CPD hours)

This course is intended for: Python-experienced attendees who wish to learn and use basic machine learning algorithms and concepts. Students should have skills at least equivalent to the Python for Data Science courses we offer.

Overview: Working in a hands-on learning environment, guided by our expert team, attendees will learn to:
* Understand the main concepts and principles of predictive analytics
* Use the Python data analytics ecosystem to implement end-to-end predictive analytics projects
* Explore advanced predictive modeling algorithms, with an emphasis on theory and intuitive explanations
* Deploy a predictive model's results as an interactive application
* Learn about the stages involved in producing complete predictive analytics solutions
* Understand how to define a problem, propose a solution, and prepare a dataset
* Use visualizations to explore relationships and gain insights into the dataset
* Build regression and classification models using scikit-learn
* Use Keras to build powerful neural network models that produce accurate predictions
* Serve a model's predictions as a web application

Predictive analytics is an applied field that employs a variety of quantitative methods using data to make predictions. It involves much more than just throwing data onto a computer to build a model. This course provides practical coverage to help you understand the most important concepts of predictive analytics. Using practical, step-by-step examples, we build predictive analytics solutions while using cutting-edge Python tools and packages.

Hands-on Predictive Analytics with Python is a three-day, hands-on course that guides students through a step-by-step approach to defining problems and identifying relevant data. Students will learn how to prepare data, explore and visualize relationships, and build, tune, evaluate, and deploy models. Each stage has relevant practical examples and efficient Python code. You will work with models such as KNN, Random Forests, and neural networks using the most important libraries in Python's data science stack: NumPy, Pandas, Matplotlib, Seaborn, Keras, Dash, and so on. In addition to hands-on code examples, you will find intuitive explanations of the inner workings of the main techniques and algorithms used in predictive analytics.

THE PREDICTIVE ANALYTICS PROCESS
* Technical requirements
* What is predictive analytics?
* Reviewing important concepts of predictive analytics
* The predictive analytics process
* A quick tour of Python's data science stack

PROBLEM UNDERSTANDING AND DATA PREPARATION
* Technical requirements
* Understanding the business problem and proposing a solution
* Practical project: diamond prices
* Practical project: credit card default

DATASET UNDERSTANDING: EXPLORATORY DATA ANALYSIS
* Technical requirements
* What is EDA?
* Univariate EDA
* Bivariate EDA
* Introduction to graphical multivariate EDA

PREDICTING NUMERICAL VALUES WITH MACHINE LEARNING
* Technical requirements
* Introduction to ML
* Practical considerations before modeling
* MLR
* Lasso regression
* KNN
* Training versus testing error

PREDICTING CATEGORIES WITH MACHINE LEARNING
* Technical requirements
* Classification tasks
* Credit card default dataset
* Logistic regression
* Classification trees
* Random forests
* Training versus testing error
* Multiclass classification
* Naive Bayes classifiers

INTRODUCING NEURAL NETS FOR PREDICTIVE ANALYTICS
* Technical requirements
* Introducing neural network models
* Introducing TensorFlow and Keras
* Regressing with neural networks
* Classification with neural networks
* The dark art of training neural networks

MODEL EVALUATION
* Technical requirements
* Evaluation of regression models
* Evaluation for classification models
* The k-fold cross-validation

MODEL TUNING AND IMPROVING PERFORMANCE
* Technical requirements
* Hyperparameter tuning
* Improving performance

IMPLEMENTING A MODEL WITH DASH
* Technical requirements
* Model communication and/or deployment phase
* Introducing Dash
* Implementing a predictive model as a web application

Hands-on Predictive Analytics with Python (TTPS4879)
Delivered on-request, online
Price on Enquiry

Advanced Analytics with Python

By Nexus Human

Duration: 3 Days (18 CPD hours)

This course is intended for: delegates who are already familiar with basic analytics techniques, comfortable with basic data manipulation tools such as spreadsheets and databases, and familiar with at least one programming language.

Overview: This course teaches delegates who are already familiar with analytics techniques and at least one programming language how to effectively use the programming language for three tasks: data manipulation and preparation, statistical analysis, and advanced analytics (including predictive modelling and segmentation). Mastery of these techniques will allow delegates to immediately add value in their workplace by extracting valuable insight from company data to allow better, data-driven decisions.

Outcomes: After completing the course, delegates will be capable of writing production-ready R code to perform advanced analytics tasks, enabling their organisations to make better, data-driven decisions.

Becoming a world-class data analytics practitioner requires mastery of the most sophisticated data analytics tools. These programming languages are some of the most powerful and flexible tools in the data analytics toolkit.

TOPIC 1
* Intro to our chosen language

TOPIC 2
* Basic programming conventions

TOPIC 3
* Data structures

TOPIC 4
* Accessing data

TOPIC 5
* Descriptive statistics

TOPIC 6
* Data visualisation

TOPIC 7
* Statistical analysis

TOPIC 8
* Advanced data manipulation

TOPIC 9
* Advanced analytics: predictive modelling

TOPIC 10
* Advanced analytics: segmentation

Advanced Analytics with Python
Delivered on-request, online
Price on Enquiry

Effective Data Visualisation

By Nexus Human

Duration: 2 Days (12 CPD hours)

This course is intended for: anyone currently working with data who is interested in using data visualisation to communicate their results more effectively. To attend this course, delegates should be competent in the use of data analysis tools such as reporting tools, spreadsheet software or business intelligence tools.

Overview: On completion, delegates will understand how data visualisations can best be used to communicate actionable insights from data, and will be competent with the tools required to do it.

COURSE OUTLINE
The use of analytics, statistics and data science in business has grown massively in recent years. Harnessing the power of data is opening up actionable insights in diverse industries, from banking to horse breeding. The companies doing this most successfully understand that using sophisticated analytics approaches to unlock insights from data is only half the job. Communicating these insights to all of the different parts of an organisation is just as important as doing the actual analysis. Visualising data, and analytics results, is one of the most effective ways to achieve this. This course will cover the theory of data visualisation along with practical skills for creating compelling visualisations from data.

The course will explore the following topics through a series of interactive workshop sessions:
* Fundamentals of data visualisation
* Data characteristics & dimensions
* Mapping visual encodings to data dimensions
* Colour theory
* Graphical perception & communication
* Interaction design
* Visualising different characteristics of data: trends, comparisons, correlations, maps, networks, hierarchies, text
* Designing effective dashboards

Effective Data Visualisation
Delivered on-request, online
Price on Enquiry

From Data to Insights with Google Cloud Platform

By Nexus Human

Duration: 3 Days (18 CPD hours)

This course is intended for:
* Data Analysts, Business Analysts, Business Intelligence professionals
* Cloud Data Engineers who will be partnering with Data Analysts to build scalable data solutions on Google Cloud Platform

Overview: This course teaches students the following skills:
* Derive insights from data using the analysis and visualization tools on Google Cloud Platform
* Interactively query datasets using Google BigQuery
* Load, clean, and transform data at scale
* Visualize data using Google Data Studio and other third-party platforms
* Distinguish between exploratory and explanatory analytics and when to use each approach
* Explore new datasets and uncover hidden insights quickly and effectively
* Optimize data models and queries for price and performance

Want to know how to query and process petabytes of data in seconds? Curious about data analysis that scales automatically as your data grows? Welcome to the Data Insights course! This four-course accelerated online specialization teaches course participants how to derive insights through data analysis and visualization using the Google Cloud Platform. The courses feature interactive scenarios and hands-on labs where participants explore, mine, load, visualize, and extract insights from diverse Google BigQuery datasets. The courses also cover data loading, querying, schema modeling, optimizing performance, query pricing, and data visualization. To get the most out of this specialization, we recommend participants have some proficiency with ANSI SQL.

INTRODUCTION TO DATA ON THE GOOGLE CLOUD PLATFORM
* Highlight Analytics Challenges Faced by Data Analysts
* Compare Big Data On-Premises vs on the Cloud
* Learn from Real-World Use Cases of Companies Transformed through Analytics on the Cloud
* Navigate Google Cloud Platform Project Basics
* Lab: Getting started with Google Cloud Platform

BIG DATA TOOLS OVERVIEW
* Walk Through Data Analyst Tasks and Challenges, and Introduce Google Cloud Platform Data Tools
* Demo: Analyze 10 Billion Records with Google BigQuery
* Explore 9 Fundamental Google BigQuery Features
* Compare GCP Tools for Analysts, Data Scientists, and Data Engineers
* Lab: Exploring Datasets with Google BigQuery

EXPLORING YOUR DATA WITH SQL
* Compare Common Data Exploration Techniques
* Learn How to Code High Quality Standard SQL
* Explore Google BigQuery Public Datasets
* Visualization Preview: Google Data Studio
* Lab: Troubleshoot Common SQL Errors

GOOGLE BIGQUERY PRICING
* Walkthrough of a BigQuery Job
* Calculate BigQuery Pricing: Storage, Querying, and Streaming Costs
* Optimize Queries for Cost
* Lab: Calculate Google BigQuery Pricing

CLEANING AND TRANSFORMING YOUR DATA
* Examine the 5 Principles of Dataset Integrity
* Characterize Dataset Shape and Skew
* Clean and Transform Data using SQL
* Clean and Transform Data using a new UI: Introducing Cloud Dataprep
* Lab: Explore and Shape Data with Cloud Dataprep

STORING AND EXPORTING DATA
* Compare Permanent vs Temporary Tables
* Save and Export Query Results
* Performance Preview: Query Cache
* Lab: Creating new Permanent Tables

INGESTING NEW DATASETS INTO GOOGLE BIGQUERY
* Query from External Data Sources
* Avoid Data Ingesting Pitfalls
* Ingest New Data into Permanent Tables
* Discuss Streaming Inserts
* Lab: Ingesting and Querying New Datasets

DATA VISUALIZATION
* Overview of Data Visualization Principles
* Exploratory vs Explanatory Analysis Approaches
* Demo: Google Data Studio UI
* Connect Google Data Studio to Google BigQuery
* Lab: Exploring a Dataset in Google Data Studio

JOINING AND MERGING DATASETS
* Merge Historical Data Tables with UNION
* Introduce Table Wildcards for Easy Merges
* Review Data Schemas: Linking Data Across Multiple Tables
* Walkthrough JOIN Examples and Pitfalls
* Lab: Join and Union Data from Multiple Tables

ADVANCED FUNCTIONS AND CLAUSES
* Review SQL Case Statements
* Introduce Analytical Window Functions
* Safeguard Data with One-Way Field Encryption
* Discuss Effective Sub-query and CTE design
* Compare SQL and JavaScript UDFs
* Lab: Deriving Insights with Advanced SQL Functions

SCHEMA DESIGN AND NESTED DATA STRUCTURES
* Compare Google BigQuery vs Traditional RDBMS Data Architecture
* Normalization vs Denormalization: Performance Tradeoffs
* Schema Review: The Good, The Bad, and The Ugly
* Arrays and Nested Data in Google BigQuery
* Lab: Querying Nested and Repeated Data

MORE VISUALIZATION WITH GOOGLE DATA STUDIO
* Create Case Statements and Calculated Fields
* Avoid Performance Pitfalls with Cache considerations
* Share Dashboards and Discuss Data Access considerations

OPTIMIZING FOR PERFORMANCE
* Avoid Google BigQuery Performance Pitfalls
* Prevent Hotspots in your Data
* Diagnose Performance Issues with the Query Explanation map
* Lab: Optimizing and Troubleshooting Query Performance

ADVANCED INSIGHTS
* Introducing Cloud Datalab
* Cloud Datalab Notebooks and Cells
* Benefits of Cloud Datalab

DATA ACCESS
* Compare IAM and BigQuery Dataset Roles
* Avoid Access Pitfalls
* Review Members, Roles, Organizations, Account Administration, and Service Accounts

From Data to Insights with Google Cloud Platform
Delivered on-request, online
Price on Enquiry

Data Warehousing on AWS

By Nexus Human

Duration: 3 Days (18 CPD hours)

This course is intended for:
* Database architects
* Database administrators
* Database developers
* Data analysts and scientists

Overview: This course is designed to teach you how to:
* Discuss the core concepts of data warehousing, and the intersection between data warehousing and big data solutions
* Launch an Amazon Redshift cluster and use the components, features, and functionality to implement a data warehouse in the cloud
* Use other AWS data and analytic services, such as Amazon DynamoDB, Amazon EMR, Amazon Kinesis, and Amazon S3, to contribute to the data warehousing solution
* Architect the data warehouse
* Identify performance issues, optimize queries, and tune the database for better performance
* Use Amazon Redshift Spectrum to analyze data directly from an Amazon S3 bucket
* Use Amazon QuickSight to perform data analysis and visualization tasks against the data warehouse

Data Warehousing on AWS introduces you to concepts, strategies, and best practices for designing a cloud-based data warehousing solution using Amazon Redshift, the petabyte-scale data warehouse in AWS. This course demonstrates how to collect, store, and prepare data for the data warehouse by using other AWS services such as Amazon DynamoDB, Amazon EMR, Amazon Kinesis, and Amazon S3. Additionally, this course demonstrates how to use Amazon QuickSight to perform analysis on your data.

MODULE 1: INTRODUCTION TO DATA WAREHOUSING
* Relational databases
* Data warehousing concepts
* The intersection of data warehousing and big data
* Overview of data management in AWS
* Hands-on lab 1: Introduction to Amazon Redshift

MODULE 2: INTRODUCTION TO AMAZON REDSHIFT
* Conceptual overview
* Real-world use cases
* Hands-on lab 2: Launching an Amazon Redshift cluster

MODULE 3: LAUNCHING CLUSTERS
* Building the cluster
* Connecting to the cluster
* Controlling access
* Database security
* Load data
* Hands-on lab 3: Optimizing database schemas

MODULE 4: DESIGNING THE DATABASE SCHEMA
* Schemas and data types
* Columnar compression
* Data distribution styles
* Data sorting methods

MODULE 5: IDENTIFYING DATA SOURCES
* Data sources overview
* Amazon S3
* Amazon DynamoDB
* Amazon EMR
* Amazon Kinesis Data Firehose
* AWS Lambda Database Loader for Amazon Redshift
* Hands-on lab 4: Loading real-time data into an Amazon Redshift database

MODULE 6: LOADING DATA
* Preparing Data
* Loading data using COPY
* Concurrent write operations
* Troubleshooting load issues
* Hands-on lab 5: Loading data with the COPY command

MODULE 7: WRITING QUERIES AND TUNING FOR PERFORMANCE
* Amazon Redshift SQL
* User-Defined Functions (UDFs)
* Factors that affect query performance
* The EXPLAIN command and query plans
* Workload Management (WLM)
* Hands-on lab 6: Configuring workload management

MODULE 8: AMAZON REDSHIFT SPECTRUM
* Amazon Redshift Spectrum
* Configuring data for Amazon Redshift Spectrum
* Amazon Redshift Spectrum Queries
* Hands-on lab 7: Using Amazon Redshift Spectrum

MODULE 9: MAINTAINING CLUSTERS
* Audit logging
* Performance monitoring
* Events and notifications
* Lab 8: Auditing and monitoring clusters
* Resizing clusters
* Backing up and restoring clusters
* Resource tagging and limits and constraints
* Hands-on lab 9: Backing up, restoring and resizing clusters

MODULE 10: ANALYZING AND VISUALIZING DATA
* Power of visualizations
* Building dashboards
* Amazon QuickSight editions and features

Data Warehousing on AWS
Delivered on-request, online
Price on Enquiry

Google Cloud Platform Big Data and Machine Learning Fundamentals

By Nexus Human

Duration: 1 Day (6 CPD hours)

This course is intended for:
* Data analysts, data scientists and business analysts getting started with Google Cloud Platform.
* Individuals responsible for designing pipelines and architectures for data processing, creating and maintaining machine learning and statistical models, querying datasets, visualizing query results and creating reports.
* Executives and IT decision makers evaluating Google Cloud Platform for use by data scientists.

Overview: This course teaches students the following skills:
* Identify the purpose and value of the key Big Data and Machine Learning products in the Google Cloud Platform.
* Use Cloud SQL and Cloud Dataproc to migrate existing MySQL and Hadoop/Pig/Spark/Hive workloads to Google Cloud Platform.
* Employ BigQuery and Cloud Datalab to carry out interactive data analysis.
* Train and use a neural network using TensorFlow.
* Employ ML APIs.
* Choose between different data processing products on the Google Cloud Platform.

This course introduces participants to the Big Data and Machine Learning capabilities of Google Cloud Platform (GCP). It provides a quick overview of the Google Cloud Platform and a deeper dive into its data processing capabilities.

INTRODUCING GOOGLE CLOUD PLATFORM
* Google Platform Fundamentals Overview.
* Google Cloud Platform Big Data Products.

COMPUTE AND STORAGE FUNDAMENTALS
* CPUs on demand (Compute Engine).
* A global filesystem (Cloud Storage).
* CloudShell.
* Lab: Set up an Ingest-Transform-Publish data processing pipeline.

DATA ANALYTICS ON THE CLOUD
* Stepping-stones to the cloud.
* Cloud SQL: your SQL database on the cloud.
* Lab: Importing data into CloudSQL and running queries.
* Spark on Dataproc.
* Lab: Machine Learning Recommendations with Spark on Dataproc.

SCALING DATA ANALYSIS
* Fast random access.
* Datalab.
* BigQuery.
* Lab: Build machine learning dataset.

MACHINE LEARNING
* Machine Learning with TensorFlow.
* Lab: Carry out ML with TensorFlow.
* Pre-built models for common needs.
* Lab: Employ ML APIs.

DATA PROCESSING ARCHITECTURES
* Message-oriented architectures with Pub/Sub.
* Creating pipelines with Dataflow.
* Reference architecture for real-time and batch data processing.

SUMMARY
* Why GCP?
* Where to go from here
* Additional Resources

Google Cloud Platform Big Data and Machine Learning Fundamentals
Delivered on-request, online
Price on Enquiry

Building Data Analytics Solutions Using Amazon Redshift

By Nexus Human

Duration: 1 Day (6 CPD hours)

This course is intended for: data warehouse engineers, data platform engineers, and architects and operators who build and manage data analytics pipelines. Attendees should have:
* Completed either AWS Technical Essentials or Architecting on AWS
* Completed Building Data Lakes on AWS

Overview: In this course, you will learn to:
* Compare the features and benefits of data warehouses, data lakes, and modern data architectures
* Design and implement a data warehouse analytics solution
* Identify and apply appropriate techniques, including compression, to optimize data storage
* Select and deploy appropriate options to ingest, transform, and store data
* Choose the appropriate instance and node types, clusters, auto scaling, and network topology for a particular business use case
* Understand how data storage and processing affect the analysis and visualization mechanisms needed to gain actionable business insights
* Secure data at rest and in transit
* Monitor analytics workloads to identify and remediate problems
* Apply cost management best practices

In this course, you will build a data analytics solution using Amazon Redshift, a cloud data warehouse service. The course focuses on the data collection, ingestion, cataloging, storage, and processing components of the analytics pipeline. You will learn to integrate Amazon Redshift with a data lake to support both analytics and machine learning workloads. You will also learn to apply security, performance, and cost management best practices to the operation of Amazon Redshift.

MODULE A: OVERVIEW OF DATA ANALYTICS AND THE DATA PIPELINE
* Data analytics use cases
* Using the data pipeline for analytics

MODULE 1: USING AMAZON REDSHIFT IN THE DATA ANALYTICS PIPELINE
* Why Amazon Redshift for data warehousing?
* Overview of Amazon Redshift

MODULE 2: INTRODUCTION TO AMAZON REDSHIFT
* Amazon Redshift architecture
* Interactive Demo 1: Touring the Amazon Redshift console
* Amazon Redshift features
* Practice Lab 1: Load and query data in an Amazon Redshift cluster

MODULE 3: INGESTION AND STORAGE
* Ingestion
* Interactive Demo 2: Connecting your Amazon Redshift cluster using a Jupyter notebook with Data API
* Data distribution and storage
* Interactive Demo 3: Analyzing semi-structured data using the SUPER data type
* Querying data in Amazon Redshift
* Practice Lab 2: Data analytics using Amazon Redshift Spectrum

MODULE 4: PROCESSING AND OPTIMIZING DATA
* Data transformation
* Advanced querying
* Practice Lab 3: Data transformation and querying in Amazon Redshift
* Resource management
* Interactive Demo 4: Applying mixed workload management on Amazon Redshift
* Automation and optimization
* Interactive Demo 5: Amazon Redshift cluster resizing from the dc2.large to ra3.xlplus cluster

MODULE 5: SECURITY AND MONITORING OF AMAZON REDSHIFT CLUSTERS
* Securing the Amazon Redshift cluster
* Monitoring and troubleshooting Amazon Redshift clusters

MODULE 6: DESIGNING DATA WAREHOUSE ANALYTICS SOLUTIONS
* Data warehouse use case review
* Activity: Designing a data warehouse analytics workflow

MODULE B: DEVELOPING MODERN DATA ARCHITECTURES ON AWS
* Modern data architectures

Building Data Analytics Solutions Using Amazon Redshift
Delivered on-request, online
Price on Enquiry

Building Data Lakes on AWS

By Nexus Human

Duration: 1 Day (6 CPD hours)

This course is intended for:
* Data platform engineers
* Solutions architects
* IT professionals

Overview: In this course, you will learn to:
* Apply data lake methodologies in planning and designing a data lake
* Articulate the components and services required for building an AWS data lake
* Secure a data lake with appropriate permissions
* Ingest, store, and transform data in a data lake
* Query, analyze, and visualize data within a data lake

In this course, you will learn how to build an operational data lake that supports analysis of both structured and unstructured data. You will learn the components and functionality of the services involved in creating a data lake. You will use AWS Lake Formation to build a data lake, AWS Glue to build a data catalog, and Amazon Athena to analyze data. The course lectures and labs further your learning with the exploration of several common data lake architectures.

MODULE 1: INTRODUCTION TO DATA LAKES
* Describe the value of data lakes
* Compare data lakes and data warehouses
* Describe the components of a data lake
* Recognize common architectures built on data lakes

MODULE 2: DATA INGESTION, CATALOGING, AND PREPARATION
* Describe the relationship between data lake storage and data ingestion
* Describe AWS Glue crawlers and how they are used to create a data catalog
* Identify data formatting, partitioning, and compression for efficient storage and query
* Lab 1: Set up a simple data lake

MODULE 3: DATA PROCESSING AND ANALYTICS
* Recognize how data processing applies to a data lake
* Use AWS Glue to process data within a data lake
* Describe how to use Amazon Athena to analyze data in a data lake

MODULE 4: BUILDING A DATA LAKE WITH AWS LAKE FORMATION
* Describe the features and benefits of AWS Lake Formation
* Use AWS Lake Formation to create a data lake
* Understand the AWS Lake Formation security model
* Lab 2: Build a data lake using AWS Lake Formation

MODULE 5: ADDITIONAL LAKE FORMATION CONFIGURATIONS
* Automate AWS Lake Formation using blueprints and workflows
* Apply security and access controls to AWS Lake Formation
* Match records with AWS Lake Formation FindMatches
* Visualize data with Amazon QuickSight
* Lab 3: Automate data lake creation using AWS Lake Formation blueprints
* Lab 4: Data visualization using Amazon QuickSight

MODULE 6: ARCHITECTURE AND COURSE REVIEW
* Post course knowledge check
* Architecture review
* Course review

Building Data Lakes on AWS
Delivered on-request, online
Price on Enquiry

Building Batch Data Analytics Solutions on AWS

By Nexus Human

Duration 1 Day 6 CPD hours

This course is intended for:
* Data platform engineers
* Architects and operators who build and manage data analytics pipelines

Overview
In this course, you will learn to:
* Compare the features and benefits of data warehouses, data lakes, and modern data architectures
* Design and implement a batch data analytics solution
* Identify and apply appropriate techniques, including compression, to optimize data storage
* Select and deploy appropriate options to ingest, transform, and store data
* Choose the appropriate instance and node types, clusters, auto scaling, and network topology for a particular business use case
* Understand how data storage and processing affect the analysis and visualization mechanisms needed to gain actionable business insights
* Secure data at rest and in transit
* Monitor analytics workloads to identify and remediate problems
* Apply cost management best practices

In this course, you will learn to build batch data analytics solutions using Amazon EMR, an enterprise-grade Apache Spark and Apache Hadoop managed service. You will learn how Amazon EMR integrates with open-source projects such as Apache Hive, Hue, and HBase, and with AWS services such as AWS Glue and AWS Lake Formation. The course addresses data collection, ingestion, cataloging, storage, and processing components in the context of Spark and Hadoop. You will learn to use EMR Notebooks to support both analytics and machine learning workloads. You will also learn to apply security, performance, and cost management best practices to the operation of Amazon EMR.

MODULE A: OVERVIEW OF DATA ANALYTICS AND THE DATA PIPELINE
* Data analytics use cases
* Using the data pipeline for analytics

MODULE 1: INTRODUCTION TO AMAZON EMR
* Using Amazon EMR in analytics solutions
* Amazon EMR cluster architecture
* Interactive Demo 1: Launching an Amazon EMR cluster
* Cost management strategies

MODULE 2: DATA ANALYTICS PIPELINE USING AMAZON EMR: INGESTION AND STORAGE
* Storage optimization with Amazon EMR
* Data ingestion techniques

MODULE 3: HIGH-PERFORMANCE BATCH DATA ANALYTICS USING APACHE SPARK ON AMAZON EMR
* Apache Spark on Amazon EMR use cases
* Why Apache Spark on Amazon EMR
* Spark concepts
* Interactive Demo 2: Connect to an EMR cluster and perform Scala commands using the Spark shell
* Transformation, processing, and analytics
* Using notebooks with Amazon EMR
* Practice Lab 1: Low-latency data analytics using Apache Spark on Amazon EMR

MODULE 4: PROCESSING AND ANALYZING BATCH DATA WITH AMAZON EMR AND APACHE HIVE
* Using Amazon EMR with Hive to process batch data
* Transformation, processing, and analytics
* Practice Lab 2: Batch data processing using Amazon EMR with Hive
* Introduction to Apache HBase on Amazon EMR

MODULE 5: SERVERLESS DATA PROCESSING
* Serverless data processing, transformation, and analytics
* Using AWS Glue with Amazon EMR workloads
* Practice Lab 3: Orchestrate data processing in Spark using AWS Step Functions

MODULE 6: SECURITY AND MONITORING OF AMAZON EMR CLUSTERS
* Securing EMR clusters
* Interactive Demo 3: Client-side encryption with EMRFS
* Monitoring and troubleshooting Amazon EMR clusters
* Demo: Reviewing Apache Spark cluster history

MODULE 7: DESIGNING BATCH DATA ANALYTICS SOLUTIONS
* Batch data analytics use cases
* Activity: Designing a batch data analytics workflow

MODULE B: DEVELOPING MODERN DATA ARCHITECTURES ON AWS
* Modern data architectures

Building Batch Data Analytics Solutions on AWS
Delivered on-request, online
Price on Enquiry

Educators matching "Big Data Analytics"

Whitehall Media

0.0(3)

Manchester

Founded in 2006, Whitehall Media delivers high-quality, content-focused conference programmes that address high-level strategic issues within the marketplaces in which we operate. Our leading-edge events impart practical and technical information through visionary keynotes, interactive seminars and lively one-to-ones. We specialise in high-value, difficult-to-engage markets and create events that merge buy-side and sell-side professionals in an innovative and strategic business exchange. We are leaders in the specialisms we cover, and our conference-led exhibitions are held in the UK, Europe and the UAE, showcasing the latest ground-breaking trends, tools and technologies for both government and industry. The environment we develop enables best practice to be shared among peers and deals to be closed on the day. We meticulously market to decision-maker delegates to ensure that we attract buyers with spending and decision-making power who will give their valuable business or leisure time to every one of our events. If you decide to participate in a Whitehall Media event you can be sure you will be in the very best of hands. Our events are professionally managed and marketed strategically. We work tirelessly with our clients to evaluate their needs, offer them the best possible service and ensure our conferences deliver real value to all participating individuals and organisations. Ensuring a successful outcome for our customers is our utmost priority.

Duco Digital Training

5.0(12)

Redcar

Duco Digital Training [https://ducodigitaltraining.com/courses] is a trusted provider of BCS-accredited online courses, boot camps and training in an exciting range of business and technology subjects: Artificial Intelligence (AI) & Machine Learning [https://ducodigitaltraining.com/artificial-intelligence-courses], Business Analysis [https://ducodigitaltraining.com/business-analysis-courses], Data Protection [https://ducodigitaltraining.com/data-protection-courses], Data Analysis [https://ducodigitaltraining.com/data-analysis-courses], Digital Product Management [https://ducodigitaltraining.com/digital-product-management-course], IT Ethics [https://ducodigitaltraining.com/business-and-it-ethics-courses], Sales and Marketing [https://ducodigitaltraining.com/sales-and-marketing-courses], and Management [https://ducodigitaltraining.com/management-courses]. These range from short courses (awards) and focused certifications at essential, foundation and practitioner levels to diplomas and bundles, all designed to fit your career goals, your available time to learn and your budget. As well as strengthening skills and knowledge in a current role, these industry-recognised qualifications are recognised in over 200 countries and can open up a range of exciting new opportunities. Learners who pass their exam with Duco Digital also receive a free one-year membership to BCS, which offers professional networking, CPD and career support. We are committed to making learning as easy as possible. Our courses are designed so you can learn at home or work, without excessive reading or time-consuming assignments. Upgrade your skills and become indispensable to your company - enrol on a course today and begin your path to success!