Data Analysis with Pandas and Python

  • 30 Day Money Back Guarantee
  • Completion Certificate
  • 24/7 Technical Support

Highlights

  • On-Demand course

  • 19 hours 26 minutes

  • All levels

Description

This course offers an immersive experience in data analysis, guiding you from initial setup with Python and Pandas, through series and DataFrame manipulation, to advanced data visualization techniques. Perfect for enhancing your data handling and analysis skills.

This course begins with the essentials, introducing you to Anaconda and Jupyter Lab setup for Python and Pandas. You'll gain foundational knowledge in Python before diving into Pandas for data analysis. The focus then shifts to Series and DataFrame structures, providing you with the skills to manage and manipulate data effectively. Further, the course covers handling dates and times, and performing various file input and output operations, essential for real-world data analysis tasks. Advanced sections delve into data visualization using Matplotlib, enabling you to create impactful charts and graphs. You'll also explore advanced Pandas options and settings, enhancing your data manipulation capabilities. By the end of this course, you'll have a comprehensive understanding of data analysis techniques. You'll be equipped to handle complex datasets, perform detailed analysis, and present data visually, opening doors to advanced data analysis and manipulation in professional settings.

What You Will Learn

Understand and utilize Python's basic and advanced data types.
Master Series and DataFrame operations in Pandas.
Handle complex data types like dates and times.
Perform input and output operations with different file types.
Create and customize various types of data visualizations.
Optimize data analysis with advanced Pandas settings and functions.

Audience

Ideal for data analysts, aspiring data scientists, and professionals keen on mastering data manipulation and analysis. This course is a perfect fit for those with basic Python knowledge looking to delve deep into data analytics using Pandas. Whether you're aiming to enhance your skillset for professional growth or apply data analysis techniques in your current role, this course offers a comprehensive learning path from Python basics to advanced data handling and visualization techniques in Pandas.

Approach

The course adopts a hands-on, step-by-step approach, starting with Python and Pandas setup and progressing through data manipulation, visualization, and advanced features. It emphasizes practical application, with examples and exercises to solidify understanding and skill development.

Key Features

  • Detailed step-by-step guidance on the installation and setup of key data analysis tools.
  • Comprehensive coverage of Pandas functionalities, from basics to advanced techniques.
  • Focus on real-world applications, enhancing your ability to analyze complex datasets.

GitHub Repo

https://github.com/PacktPublishing/Data-Analysis-with-Pandas-and-Python

About the Author

Boris Paskhaver

Boris Paskhaver is a New York City-based software engineer, author, and Udemy instructor with a unique journey into tech. Graduating from NYU in 2013 with a degree in Business Economics and Marketing, he initially worked in various roles, including business analyst and data analyst, at several companies. His coding journey began accidentally while building projects with Python and JavaScript, leading him to passionately pursue programming. Without formal computer science education, Boris completed App Academy's full-stack web development bootcamp, diving headfirst into web development. As an instructor, Boris focuses on creating comprehensive, easy-to-understand courses, addressing the challenges he faced learning to code. He's driven by the intersection of technology and education, aiming to make programming accessible to all. Boris brings this passion to his teaching, helping others unlock the potential of coding.

Course Outline

1. Installation and Setup

This section sets the stage with an introduction to the course's framework, then walks you through installing the Anaconda distribution, a popular Python and R data science platform. It focuses on setting up the Python environment using Anaconda Navigator and familiarizes you with the Jupyter Lab interface. Key skills taught include executing code cells, importing libraries, and understanding the startup and shutdown processes in Python environments, forming a solid base for efficient data analysis.

1. Introduction to the Course

This video provides an overview of the course, setting expectations and outlining the learning journey ahead.

2. macOS - Download and Install the Anaconda Distribution

Learn how to download and install the Anaconda distribution, a popular Python and R data science platform, on macOS.

3. Windows - Download and Install the Anaconda Distribution

This tutorial guides you through the installation process of the Anaconda distribution on a Windows operating system.

4. Use Anaconda Navigator to Create a New Environment

Discover how to use Anaconda Navigator to create and manage your Python environments for data analysis.

5. Unpack Course Materials + The Startup and Shutdown Process

Unpack the course materials provided and understand the startup and shutdown processes in your new Python environment.

6. Intro to the Jupyter Lab Interface

Get introduced to the Jupyter Lab interface, a key tool for Python programming and data analysis.

7. Code Cell Execution

Learn about executing code cells within Jupyter Lab, a fundamental aspect of interactive Python programming.

8. Import Libraries into Jupyter Lab

Understand how to import necessary Python libraries into Jupyter Lab for data analysis.


2. Python Crash Course

This section offers a comprehensive Python crash course. It covers the essentials of Python programming, including comments, basic data types, operators, variables, functions, string methods, and core data structures like lists and dictionaries. You'll also gain insight into Python classes and navigating libraries in Jupyter Lab, crucial for effective data management.

1. Comments

This video explains how to use comments in Python, an essential practice for making your code readable and maintainable.

2. Basic Data Types

In this session, dive into Python's basic data types, foundational knowledge for any aspiring Python programmer.

3. Operators

In this video, we will explore Python's operators to perform various operations on data.

4. Variables

In this video, learn all about defining and using variables in Python, a key concept in programming.

5. Built-in Functions

In this tutorial, discover Python's built-in functions, which are readily available for various tasks.

6. Custom Functions

Understand how to create and use custom functions in Python to encapsulate reusable code logic.

7. String Methods

Explore various string methods in Python that are vital for text data manipulation.

8. Lists

Learn about lists in Python, a versatile data structure for storing collections of items.

9. Index Positions and Slicing

Master the techniques of indexing and slicing in Python to access and modify data in lists.

10. Dictionaries

Dive into dictionaries, a Python data structure that stores data in key-value pairs.

11. Classes

Get an introduction to Python classes, a fundamental concept of object-oriented programming.

12. Navigating Libraries using Jupyter Lab

Learn how to navigate and utilize various Python libraries within the Jupyter Lab environment.
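
To give a flavour of the topics in this crash course, here is a minimal sketch touching on custom functions, lists, slicing, dictionaries, and classes. It is not taken from the course materials; the names `total_with_tax` and `Basket` and all values are invented for illustration.

```python
# Hypothetical example data and names, invented for illustration only.
prices = [3.5, 2.0, 4.25, 1.75]


def total_with_tax(amounts, rate=0.1):
    """Return the sum of the amounts plus tax at the given rate."""
    return sum(amounts) * (1 + rate)


first_two = prices[:2]   # slicing: items at index positions 0 and 1
last_item = prices[-1]   # a negative index counts from the end

stock = {"apple": 3, "pear": 0}   # dictionary of key-value pairs
stock["pear"] += 5                # update the value stored under a key


class Basket:
    """A tiny class that keeps a list of item names."""

    def __init__(self):
        self.items = []

    def add(self, name):
        self.items.append(name)
        return len(self.items)


basket = Basket()
basket.add("apple")
```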


3. Series

Focusing on Pandas Series, this section delves into creating and manipulating Series objects from lists and dictionaries. It introduces Series methods and attributes, covering topics like data import using `pd.read_csv`, element inspection with head and tail methods, value sorting, and inclusion checks. The section also covers advanced techniques like overwriting values, Series copying, math operations, broadcasting, and applying functions, vital for proficient data analysis in Pandas.

1. Create a Series Object from a List

Discover how to create a Pandas Series object from a Python list, a basic operation in data analysis.

2. Create a Series Object from a Dictionary

Learn to create a Pandas Series object from a dictionary, allowing for labeled data manipulation.

3. Intro to Series Methods

Get introduced to various methods available for Pandas Series objects, enhancing your data manipulation skills.

4. Intro to Attributes

Understand the attributes of Pandas Series, which provide information about the data structure.

5. Parameters and Arguments

Explore the parameters and arguments used in Pandas methods for more controlled data operations.

6. Import Series with the pd.read_csv Function

Learn how to import data into a Pandas Series using the pd.read_csv function, a common task in data analysis.

7. The head and tail Methods

Discover how to use the head and tail methods in Pandas to quickly inspect the beginning and end of a Series.

8. Passing Series to Python Built-In Functions

Understand how to pass Pandas Series objects to Python's built-in functions for various computations.

9. Check for Inclusion with Python's in Keyword

Learn to check for the inclusion of elements in a Pandas Series using Python's in keyword.

10. The sort_values Method

Master the sort_values method in Pandas to sort data in a Series.

11. The sort_index Method

Understand how to use the sort_index method to sort a Pandas Series by its index.

12. Extract Series Values by Index Position

Learn how to extract values from a Pandas Series based on their index positions.

13. Extract Series Values by Index Label

Discover how to retrieve data from a Pandas Series using index labels.

14. The get Method

Get familiar with the get method for safely retrieving values from a Pandas Series.

15. Overwrite a Series Value

Learn how to overwrite values in a Pandas Series, an essential skill for data cleaning.

16. The copy Method

Understand the importance of the copy method in Pandas for creating independent copies of data.

17. Math Methods on Series Objects

Explore various mathematical methods available on Pandas Series for data analysis.

18. Broadcasting

Get an introduction to broadcasting in Pandas, which allows for operations on all elements of a Series.

19. The value_counts Method

Learn about the value_counts method, a powerful tool for summarizing categorical data in a Series.

20. The apply Method

Discover how to use the apply method to apply a function to each element in a Series.

21. The map Method

Understand the map method in Pandas for transforming each element in a Series.
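
As a rough illustration of the Series techniques listed above, the following sketch builds a Series from a dictionary and applies a handful of the methods named in this section. The data is invented and is not the dataset used in the course.

```python
import pandas as pd

# Invented sample data for illustration.
temps = pd.Series({"Mon": 21.5, "Tue": 19.0, "Wed": 23.0})

sorted_temps = temps.sort_values(ascending=False)   # highest value first
by_label = temps["Tue"]                             # extract by index label
by_position = temps.iloc[0]                         # extract by index position
missing = temps.get("Sun", default=None)            # get returns the default instead of raising

has_label = "Mon" in temps           # `in` checks the index labels
has_value = 19.0 in temps.values     # check the values explicitly

fahrenheit = temps * 9 / 5 + 32      # broadcasting: the arithmetic applies to every element
labels = temps.apply(lambda t: "warm" if t > 20 else "cool")
counts = labels.value_counts()       # how often each label occurs
```

Note that `in` on a Series looks at the index rather than the values, which is why the value check goes through `.values`.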


4. DataFrames I: Introduction

This section provides an introduction to Pandas DataFrames, beginning with a comparison between Series and DataFrames. It covers essential DataFrame manipulation techniques, including column selection, addition, and the value_counts method for data analysis. The section further delves into handling missing values through methods like row dropping and the fillna function. Additionally, it explores the astype method for data type conversion and concludes with in-depth sorting techniques, including value and index-based sorting and the rank method, laying a solid foundation in DataFrame operations.

1. Methods and Attributes between Series and DataFrames

This video compares the methods and attributes of Series and DataFrames, highlighting their similarities and differences.

2. Differences between Shared Methods

Explore the nuances and differences in shared methods between Series and DataFrames in Pandas.

3. Select One Column from a DataFrame

Learn how to select a single column from a DataFrame, a fundamental skill in data frame manipulation.

4. Select Multiple Columns from a DataFrame

Understand the techniques to select multiple columns from a DataFrame, enhancing your data analysis capabilities.

5. Add New Column to DataFrame

Discover how to add new columns to a DataFrame, a key step in enriching your data set.

6. A Review of the value_counts Method

This session reviews the value_counts method and its applications in analyzing DataFrame columns.

7. Drop DataFrame Rows with Missing Values

Learn strategies to drop rows with missing values in a DataFrame, crucial for data cleaning.

8. Fill in Missing Values with the fillna Method

Master the use of the fillna method to fill in missing values in a DataFrame.

9. The astype Method I

This video introduces the astype method in Pandas, focusing on its basic usage for converting data types within a DataFrame.

10. The astype Method II

A deeper exploration into the astype method, covering more complex scenarios and best practices for type conversion in Pandas.

11. Sort a DataFrame with the sort_values Method I

Learn the fundamentals of the sort_values method, including sorting DataFrames based on one or more columns.

12. Sort a DataFrame with the sort_values Method II

Expands on the sort_values method, exploring advanced options and techniques for sorting DataFrames in Pandas.

13. Sort DataFrame with the sort_index Method

Learn how to sort DataFrames by their index using the sort_index method in this video.

14. Rank Series Values with the rank Method

Explore the rank method to rank values within a DataFrame column.
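
The DataFrame operations in this section can be sketched as follows. This is a small illustrative example under assumed data (the column names and values are invented, not the course dataset).

```python
import numpy as np
import pandas as pd

# Invented sample data with some missing values.
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cy", "Di"],
    "team": ["red", "blue", None, "red"],
    "score": [88.0, np.nan, 75.0, 92.0],
})

names = df["name"]                    # one column gives a Series
subset = df[["name", "score"]]        # a list of columns gives a DataFrame
team_counts = df["team"].value_counts()

complete = df.dropna()                # drop rows with any missing value
filled = df.fillna({"team": "unknown", "score": 0})
filled["score"] = filled["score"].astype(int)     # convert float scores to integers
filled["passed"] = filled["score"] >= 80          # add a new column

ranked = filled.sort_values("score", ascending=False)
filled["rank"] = filled["score"].rank(ascending=False).astype(int)
```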


5. DataFrames II: Filtering Data

This section dives into advanced data filtering within Pandas DataFrames, starting with an introduction to the dataset and memory optimization. It covers row filtering based on various conditions using the logical operators AND (&) and OR (|), and the isin method for value-based filtering. It also addresses handling null values with isnull and notnull, range filtering using between, and identifying duplicates with duplicated. It concludes by teaching how to remove duplicate rows using drop_duplicates and how to find unique values, equipping learners with vital skills for efficient data cleaning and preparation in Pandas.

1. This Module's Dataset + Memory Optimization

Get introduced to the dataset used in this module and learn techniques for memory optimization.

2. Filter a DataFrame Based on a Condition

Learn to filter rows in a DataFrame based on specific conditions.

3. Filter with More than One Condition (AND - &)

Understand how to apply multiple filter conditions to a DataFrame using logical operators. This video focuses on the AND (&) operator.

4. Filter with More than One Condition (OR - |)

Understand how to apply multiple filter conditions to a DataFrame using logical operators. This video focuses on the OR (|) operator.

5. The isin Method

Discover the isin method to filter DataFrame rows based on a list of values.

6. The isnull and notnull Methods

Learn to identify and handle null values in a DataFrame using isnull and notnull.

7. The between Method

Explore the between method to filter DataFrame rows within a certain range.

8. The duplicated Method

Understand how to identify duplicated rows in a DataFrame using the duplicated method.

9. The drop_duplicates Method

Learn the process of removing duplicate rows from a DataFrame with drop_duplicates.

10. The unique and nunique Methods

Explore how to find unique values and count them in DataFrame columns.
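
A minimal sketch of the filtering methods named in this section, assuming a small invented staff table rather than the module's actual dataset:

```python
import pandas as pd

# Invented sample data; the last row duplicates the first.
staff = pd.DataFrame({
    "name": ["Ana", "Ben", "Cy", "Di", "Ana"],
    "dept": ["ops", "it", "it", None, "ops"],
    "salary": [52000, 61000, 47000, 58000, 52000],
})

# Each condition needs its own parentheses when combined with & or |.
it_high = staff[(staff["dept"] == "it") & (staff["salary"] > 50000)]
ops_or_low = staff[(staff["dept"] == "ops") | (staff["salary"] < 50000)]

in_depts = staff[staff["dept"].isin(["ops", "hr"])]
no_dept = staff[staff["dept"].isnull()]
mid_band = staff[staff["salary"].between(50000, 60000)]   # both ends inclusive by default

dupes = staff.duplicated()           # True for rows that repeat an earlier row
deduped = staff.drop_duplicates()
n_depts = staff["dept"].nunique()    # missing values are not counted by default
```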


6. DataFrames III: Data Extraction

This section focuses on mastering data extraction techniques in DataFrames. It starts with dataset familiarization and delves into DataFrame structuring using set_index and reset_index. Key skills taught include retrieving data by index position or label using iloc and loc, overwriting data values, and renaming index labels or columns for clarity. The section also addresses deleting rows or columns, creating random samples, and extracting specific rows with nsmallest and nlargest. It culminates with conditional filtering and function application across DataFrame rows or columns, enhancing data manipulation proficiency.

1. This Module's Dataset

This session introduces the dataset used in this module for practicing data extraction.

2. The set_index and reset_index Methods

Learn how to set and reset index in a DataFrame, an important aspect of DataFrame structuring.

3. Retrieve Rows by Index Position with iloc Accessor

Master retrieving rows by index position using the iloc accessor in Pandas.

4. Retrieve Rows by Index Label with loc Accessor

Discover how to access DataFrame rows by index labels using the loc accessor.

5. Second Arguments to loc and iloc Accessors

Understand the use of second arguments in loc and iloc for more precise data retrieval.

6. Overwrite Value in a DataFrame

This tutorial covers how to overwrite individual values in a DataFrame, essential for updating data entries or correcting errors.

7. Overwrite Multiple Values in a DataFrame

Explores techniques for bulk updating or modifying multiple values simultaneously in a DataFrame, crucial for efficient data management.

8. Rename Index Labels or Columns in a DataFrame

Discover how to rename index labels or columns in a DataFrame for better data clarity.

9. Delete Rows or Columns from a DataFrame

Understand how to delete rows or columns from a DataFrame, an essential data cleaning skill.

10. Create Random Sample with the sample Method

Learn to create a random sample from a DataFrame using the sample method.

11. The nsmallest and nlargest Methods

Explore the nsmallest and nlargest methods to extract specific rows based on column values.

12. Filtering with the where Method

Master the where method for conditional filtering in DataFrames.

13. The apply Method with DataFrames

Learn the use of the apply method to execute functions across DataFrame rows or columns.
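
The extraction techniques in this section might look like the sketch below. The film titles and figures are invented placeholders, not the module's dataset.

```python
import pandas as pd

# Invented sample data, indexed by title.
films = pd.DataFrame({
    "title": ["Alpha", "Bravo", "Charlie", "Delta"],
    "year": [1999, 2005, 2012, 2020],
    "gross": [120.5, 80.0, 310.2, 45.9],
}).set_index("title")

by_label = films.loc["Bravo", "year"]   # row label, then column label
by_position = films.iloc[0, 1]          # first row, second column

films.loc["Delta", "gross"] = 50.0      # overwrite a single value

renamed = films.rename(columns={"gross": "gross_millions"})
without_year = films.drop(columns=["year"])
top_two = films.nlargest(2, "gross")
picked = films.sample(n=2, random_state=0)           # random sample of two rows
masked = films["gross"].where(films["gross"] > 100)  # values failing the condition become NaN
flat = films.reset_index()                           # move the index back into a column
```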


7. Working with Text Data

This section is dedicated to handling text data in Pandas, beginning with dataset introduction and progressing to common string methods for text manipulation. It covers filtering DataFrame rows using string methods and applying these methods to DataFrame indices and columns. Learners will explore the split method, including its expand and n parameters, for detailed text analysis. The section provides practical exercises for applying these string methods, essential for those working with textual data in data analysis.

1. This Module's Dataset

Introduces the dataset used in this module, focusing on handling text data in Pandas.

2. Common String Methods

Demonstrates the use of common string methods in Pandas for text data manipulation.

3. Filtering with String Methods

Teaches how to filter DataFrame rows using string methods for text data analysis.

4. String Methods on Index and Columns

Explores the application of string methods on DataFrame indices and columns for data organization.

5. The split Method

Covers the split method to divide text data into multiple parts for detailed analysis.

6. More Practice with Splits

Provides additional exercises and examples to master the split method in various scenarios.

7. The expand and n Parameters of the split Method

Explains the use of expand and n parameters in the split method for customized text splitting.
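
As an illustration of the string methods and the split parameters covered here, consider this short sketch. The names are sample values chosen for the example, and the column names are assumptions.

```python
import pandas as pd

# Sample values with inconsistent spacing and casing.
people = pd.DataFrame({
    "full_name": ["  ada LOVELACE ", "alan turing", "grace brewster hopper"],
})

# String methods are reached through the .str accessor and can be chained.
clean = people["full_name"].str.strip().str.title()
has_alan = people[clean.str.contains("Alan")]

# n=1 splits at most once; expand=True spreads the pieces into columns.
parts = clean.str.split(" ", n=1, expand=True)
people["first"] = parts[0]
people["rest"] = parts[1]
```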


8. MultiIndex

In this section, learners are introduced to the advanced concept of MultiIndex in Pandas, essential for complex data structuring. The section walks through creating a MultiIndex for sophisticated data grouping and teaches extracting values from different levels. It covers sorting MultiIndex DataFrames, renaming index labels, and methods like transpose, stack, and unstack for DataFrame transformation. The section concludes with an in-depth look at the pivot and melt methods and creating pivot tables, equipping learners with advanced data structuring techniques.

1. Intro to the MultiIndex Module

This video offers an introduction to MultiIndex in Pandas, setting the stage for complex data structuring.

2. Create a MultiIndex

Guides through creating a MultiIndex in Pandas, a technique for advanced data grouping.

3. Extract Index Level Values

Demonstrates how to extract values from different levels of a MultiIndex for detailed data analysis.

4. Rename Index Labels

Teaches how to rename index labels in a MultiIndex DataFrame for clarity and readability.

5. The sort_index Method on a MultiIndex DataFrame

Shows how to sort a MultiIndex DataFrame using the sort_index method for organized data presentation.

6. Extract Rows from a MultiIndex DataFrame

Covers techniques to extract specific rows from a MultiIndex DataFrame, enhancing data access.

7. The transpose Method

Introduces the transpose method to rearrange DataFrame rows and columns for different perspectives.

8. The stack Method

Explains the stack method to reshape DataFrame columns into a MultiIndex on rows.

9. The unstack Method

Demonstrates the unstack method for pivoting level values from the index to the columns.

10. The pivot Method

This video covers the pivot method to reshape and reorganize data in a DataFrame based on column values.

11. The melt Method

Teaches the melt method for transforming DataFrames into a format with one or more identifier variables.

12. The pivot_table Method

Introduces the pivot_table method for creating a spreadsheet-style pivot table as a DataFrame.
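
The reshaping methods in this section can be sketched on a tiny invented sales table (the column names and numbers are assumptions made for the example):

```python
import pandas as pd

# Invented sample data.
sales = pd.DataFrame({
    "region": ["East", "East", "West", "West"],
    "quarter": ["Q1", "Q2", "Q1", "Q2"],
    "revenue": [100, 120, 90, 110],
})

# Two index levels: region, then quarter.
indexed = sales.set_index(["region", "quarter"]).sort_index()
east_q2 = indexed.loc[("East", "Q2"), "revenue"]
regions = indexed.index.get_level_values("region")

wide = indexed["revenue"].unstack()   # the quarter level moves to the columns
tall = wide.stack()                   # and back to a MultiIndex on the rows

pivoted = sales.pivot(index="region", columns="quarter", values="revenue")
melted = pivoted.reset_index().melt(id_vars="region", value_name="revenue")
totals = sales.pivot_table(values="revenue", index="region", aggfunc="sum")
```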


9. GroupBy

This section explores Pandas' GroupBy functionality, essential for data aggregation. It begins with an overview and practical use of the groupby method for grouping data, then moves to retrieving specific groups and applying various aggregation methods. The section also covers grouping by multiple columns, using the agg method for complex operations, and concludes with techniques for iterating through group data, providing a thorough understanding of data grouping and aggregation.

1. Intro to the GroupBy Module

Provides an overview of the GroupBy functionality in Pandas, essential for aggregating data.

2. The groupby Method

Demonstrates the use of the groupby method for grouping data based on some criteria.

3. Retrieve A Group with the get_group Method

Shows how to retrieve a specific group from a grouped DataFrame using the get_group method.

4. Methods on the GroupBy Object

Explores various methods available on GroupBy objects for different types of data aggregation.

5. Grouping by Multiple Columns

Teaches how to group data by multiple columns for more complex data analyses.

6. The agg Method

Covers the agg method to apply one or more operations over the grouped data.

7. Iterating through Groups

Provides techniques for iterating through groups in a GroupBy object for individual data processing.
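
A minimal sketch of the grouping workflow described in this section, using invented order data:

```python
import pandas as pd

# Invented sample data.
orders = pd.DataFrame({
    "customer": ["a", "b", "a", "c", "b"],
    "channel": ["web", "web", "shop", "web", "shop"],
    "amount": [10.0, 25.0, 5.0, 40.0, 15.0],
})

by_customer = orders.groupby("customer")
totals = by_customer["amount"].sum()            # one total per customer
group_a = by_customer.get_group("a")            # the rows belonging to customer "a"
summary = by_customer["amount"].agg(["sum", "mean", "count"])
by_both = orders.groupby(["customer", "channel"])["amount"].sum()

# Iterating yields (group name, sub-DataFrame) pairs.
largest = {name: group["amount"].max() for name, group in by_customer}
```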


10. Merging DataFrames

This section is dedicated to DataFrame merging, a crucial aspect of dataset combination in Pandas. It introduces merging techniques, focusing on the pd.concat function for DataFrame concatenation. The section covers various join types, including left, inner, and full-outer joins, and parameter usage for precise column matching. It also explains merging DataFrames by indexes and concludes with the join method, offering a comprehensive guide to DataFrame merging strategies.

1. Intro to the Merging DataFrames Module

Provides an introduction to techniques for merging DataFrames, a key aspect of combining datasets in Pandas.

2. The pd.concat Function I

The first of two sessions on pd.concat, demonstrating how to use the function to concatenate DataFrames along a particular axis.

3. The pd.concat Function II

The second of two sessions on pd.concat, continuing the demonstration of concatenating DataFrames along a particular axis.

4. Left Joins

Explores the concept of left joins, showing how to merge DataFrames with a focus on keys from the left frame.

5. The left_on and right_on Parameters

Teaches the use of left_on and right_on parameters in DataFrame joins for specific column matching.

6. Inner Joins I

The first of two sessions on inner joins, detailing how to combine DataFrames based on the intersection of keys.

7. Inner Joins II

The second of two sessions on inner joins, continuing the look at combining DataFrames based on the intersection of keys.

8. Full-Outer Joins

This video explains full-outer joins, a method to merge DataFrames including all keys from both frames.

9. Merging by Indexes with the left_index and right_index Parameters

This video shows how to merge DataFrames using indexes as keys with left_index and right_index parameters.

10. The join Method

Introduces the join method for merging DataFrames, a simpler alternative to using the merge function.
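
The join types in this section differ mainly in which keys survive the merge. The sketch below shows this on two small invented tables (the column names are assumptions for the example):

```python
import pandas as pd

# Invented sample data: customer 1 has two orders, customer 4 has no customer record.
customers = pd.DataFrame({"cust_id": [1, 2, 3], "name": ["Ana", "Ben", "Cy"]})
orders = pd.DataFrame({
    "order_id": [10, 11, 12],
    "customer": [1, 1, 4],
    "total": [20, 35, 50],
})

stacked = pd.concat([customers, customers], ignore_index=True)   # append rows

# left_on / right_on match key columns that have different names.
left = customers.merge(orders, how="left", left_on="cust_id", right_on="customer")
inner = customers.merge(orders, how="inner", left_on="cust_id", right_on="customer")
outer = customers.merge(orders, how="outer", left_on="cust_id", right_on="customer")

# join matches on the index of both frames.
by_index = customers.set_index("cust_id").join(orders.set_index("customer"), how="inner")
```

The left join keeps every customer (4 rows), the inner join keeps only matched keys (2 rows), and the full-outer join keeps keys from both sides (5 rows).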


11. Working with Dates and Times

This section covers handling dates and times in Pandas, starting with an overview and a review of Python's datetime. It teaches using Timestamp and DatetimeIndex objects, creating date ranges with pd.date_range, and accessing date-time properties via the dt attribute. The section also delves into time-based arithmetic with DateOffset, specialized date offsets, and timedeltas, enhancing skills in managing time series data.

1. Intro to the Working with Dates and Times Module and Review of Python's datetime

Introduces the module on working with dates and times in Pandas, along with a review of Python's datetime.

2. The Timestamp and DatetimeIndex Objects

Covers the use of Timestamp and DatetimeIndex objects in Pandas for date-time data manipulation.

3. Create Range of Dates with pd.date_range Function

Demonstrates creating a range of dates using the pd.date_range function, crucial for time series data.

4. The dt Attribute

Explores the dt attribute, allowing access to date and time properties of a Pandas Series.

5. Selecting Rows from a DataFrame with DatetimeIndex

Teaches techniques for selecting rows based on date-time indexes in a DataFrame.

6. The DateOffset Object

Introduces the DateOffset object to perform time-based arithmetic operations on dates.

7. Specialized Date Offsets

Covers specialized date offsets in Pandas for handling complex date-time manipulations.

8. Timedeltas

Explains the concept of timedeltas in Pandas, used for representing durations of time.
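
To illustrate the date and time objects covered in this section, here is a short sketch with arbitrarily chosen dates:

```python
import pandas as pd

# Five consecutive days spanning a month boundary (dates chosen arbitrarily).
days = pd.date_range(start="2024-01-29", periods=5, freq="D")   # a DatetimeIndex
readings = pd.DataFrame({"value": [1, 2, 3, 4, 5]}, index=days)

stamp = pd.Timestamp("2024-01-31")
february = readings.loc["2024-02"]            # partial-string selection on a DatetimeIndex
weekday_names = pd.Series(days).dt.day_name() # dt exposes datetime properties on a Series

next_month = stamp + pd.DateOffset(months=1)                     # calendar-aware arithmetic
month_end = pd.Timestamp("2024-02-10") + pd.offsets.MonthEnd()   # roll forward to month end
gap = pd.Timestamp("2024-03-01") - stamp                         # subtraction gives a Timedelta
```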


12. Input and Output

This section provides a complete guide on Pandas' input and output operations. It demonstrates exporting DataFrames to CSV, a standard data sharing format, and guides on using the openpyxl library for Excel file interactions. The section shows practical ways to import and export Excel files in Pandas, essential for diverse data analysis tasks.

1. Intro to the Input and Output Module

Provides an overview of input and output operations in Pandas, essential for data exchange.

2. Export DataFrame to CSV File

Demonstrates how to export a DataFrame to a CSV file, a common data sharing format.

3. Install openpyxl Library to Read and Write Excel Files

Guides on installing the openpyxl library, enabling reading and writing Excel files in Pandas.

4. Import Excel File into pandas

Shows how to import data from an Excel file into Pandas, a frequent operation in data analysis.

5. Export Excel File from pandas

This video teaches the process of exporting Pandas DataFrames to Excel files, useful for data reporting and sharing.
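
A CSV round trip can be sketched without touching the disk by writing to an in-memory buffer. The data below is invented, and the Excel calls are left as comments because they require openpyxl and a file path (`cities.xlsx` is a hypothetical name).

```python
import io

import pandas as pd

# Invented sample data.
df = pd.DataFrame({"city": ["A-town", "B-ville"], "population": [800, 200]})

buffer = io.StringIO()
df.to_csv(buffer, index=False)   # index=False leaves the row labels out of the file
buffer.seek(0)                   # rewind before reading the text back
restored = pd.read_csv(buffer)

# Excel equivalents (need the openpyxl library installed):
# df.to_excel("cities.xlsx", index=False)
# pd.read_excel("cities.xlsx")
```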


13. Visualization

This section focuses on data visualization with Pandas and Matplotlib in Python. Starting with installing Matplotlib, it teaches how to create charts directly from Pandas data. The section introduces the `plot` method for basic line plots, followed by enhancing visuals using templates for improved aesthetics. Learners will explore crafting bar charts for comparative analysis and pie charts for proportional data representation. This section equips learners with practical skills to effectively convey data insights through visual storytelling.

1. Install matplotlib Library for Visualization

Guides on installing the Matplotlib library, a powerful tool for creating a wide range of static, animated, and interactive visualizations in Python.

2. The plot Method

Introduces the plot method in Pandas for basic line plots, enabling quick and easy visualization of data series.

3. Modifying Plot Aesthetics with Templates

Demonstrates how to modify plot aesthetics using templates to enhance the visual appeal of data representations.

4. Bar Charts

In this session we will cover the creation of bar charts in Pandas, useful for comparing different groups or tracking changes over time.

5. Pie Charts

This video teaches how to make pie charts, which are effective for showing proportions of a whole in a visually intuitive way.
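
The chart types in this section can be sketched as below. Two assumptions are worth flagging: the "templates" mentioned above are taken here to mean Matplotlib style sheets (`plt.style.use`), and the non-interactive `Agg` backend is selected only so the snippet runs as a plain script; inside Jupyter Lab that line is not needed. The data is invented.

```python
import matplotlib

matplotlib.use("Agg")   # assumption: no display available; omit inside Jupyter Lab

import matplotlib.pyplot as plt
import pandas as pd

plt.style.use("ggplot")   # assumption: "templates" refers to Matplotlib style sheets

# Invented sample data.
sales = pd.Series([3, 5, 4, 7], index=["Q1", "Q2", "Q3", "Q4"], name="units")

line_ax = sales.plot(kind="line", title="Units per quarter")

plt.figure()              # start a new figure so the charts do not overlap
bar_ax = sales.plot(kind="bar")

plt.figure()
pie_ax = sales.plot(kind="pie")
```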


14. Options and Settings

In this section, learners explore various options and settings in Pandas for customized data analysis. The section demonstrates altering Pandas settings using attributes and functions for a flexible analysis environment and discusses the precision option, crucial for accurate data display.

1. Introduction to the Options and Settings Module

This session provides an overview of various options and settings in Pandas, allowing for customization of Pandas' behavior and output.

2. Changing Options with Attributes

Shows how to change Pandas options using attributes, offering a way to adjust settings to suit different analysis needs.

3. Changing Options with Functions

Explores changing Pandas options using functions, which can provide more flexibility and control over the data analysis environment.

4. The precision Option

Discusses the precision option in Pandas, which controls the output display precision of floating-point numbers, important for data clarity and readability.
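
A minimal sketch of changing options by attribute and by function, including the precision option. The specific values are arbitrary choices for the example.

```python
import pandas as pd

# Attribute style: assign to the option directly.
pd.options.display.max_rows = 20

# Function style: set, read, and reset an option by name.
pd.set_option("display.precision", 2)
current = pd.get_option("display.precision")
text = repr(pd.Series([3.14159]))       # displayed with two decimal places
pd.reset_option("display.precision")    # restore the default
restored = pd.get_option("display.precision")
```

Precision affects only how floats are displayed; the stored values are unchanged.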


15. Conclusion

The final section summarizes the entire course, emphasizing the key concepts and techniques in data analysis with Pandas and Python. It consolidates the comprehensive skill set developed, preparing learners to effectively handle data analysis in real-world scenarios.

1. Conclusion

This video wraps up the course by summarizing key concepts and techniques learned, reinforcing the comprehensive skill set acquired in data analysis with Pandas and Python.

About The Provider

Packt
Birmingham
Founded in 2004 in Birmingham, UK, Packt’s mission is to help the world put software to work in new ways, through the delivery of effective learning and i...