Web Scraping Tutorial with Scrapy and Python for Beginners

  • 30 Day Money Back Guarantee
  • Completion Certificate
  • 24/7 Technical Support

Highlights

  • On-Demand course

  • 7 hours 36 minutes

  • All levels

Description

This course assumes that you know nothing about web scraping or Scrapy, or even what web scraping means, so the author starts from the very basics. It balances theory with practical content and ends with three projects that help you build working scraping skills.

Web scraping is the process of extracting desired data from websites. In this course, you will learn web scraping with Python and Scrapy through a step-by-step, in-depth guide.

The course starts by introducing the web scraping process with infographics and no code, showing how data is scraped from websites and where Scrapy fits in. With the basics clear, you will build an actual web scraper using Python and the Scrapy framework and see first-hand how web scraping works. You will then look at the essential concepts of web scraping and Scrapy.

Knowing how to scrape websites and understanding the essentials already makes you a capable web scraper, but the course goes further into advanced techniques. These include crawling multiple pages and extracting data with pagination, scraping data using regular expressions (RegEx), and scraping dynamic or JavaScript-rendered websites using Scrapy Playwright.

Finally, you will complete three projects: Champions League Table [ESPN], Product Tracker [Amazon], and Scraper Application [GUI]. By the end of this course, you will have learned how to do web scraping using Python and Scrapy.

All the resource files are added to the GitHub repository at: https://github.com/PacktPublishing/Web-Scraping-Tutorial-with-Scrapy-and-Python-for-Beginners-

What You Will Learn

Send a request to a URL to scrape websites using Scrapy Spider
Get the HTML Response from the URL and parse it for web scraping
Use Scrapy shell commands to test and verify CSS Selectors or XPath
Export and save scraped data to online databases such as MongoDB
Scrape data from multiple web pages using Scrapy pagination
Log in to websites using Scrapy FormRequest with CSRF tokens

Audience

This course is ideal for beginner Python developers who want to master web scraping and for freelance web scrapers looking to polish their skills. It also suits individuals and college students who are working on projects and want to master web scraping using Python and the Scrapy module. A basic understanding of Python programming is a must; elementary knowledge of HTML is a plus but not mandatory.

Approach

This course is divided into four parts, each made up of bite-sized video lessons. The first part builds a step-by-step understanding of web scraping from scratch. The second part covers the essentials of web scraping and Scrapy. The third part focuses on mastering web scraping, and the final part focuses on completing three live real-world projects.

Key Features

  • A well-balanced and structured course with practical projects at the end
  • Scrape Champions League Table [ESPN], Product Tracker [Amazon], and build Scraper Application [GUI]
  • Bite-sized videos, bundled with all the requisite materials

GitHub Repo

https://github.com/PacktPublishing/Web-Scraping-Tutorial-with-Scrapy-and-Python-for-Beginners-

About the Author

Rahul Mula

Rahul Mula is a passionate developer with expertise in Python, Flutter, and web development. He was really intrigued the first time he learned about programming and realized what could be done with it. Rahul thrives on exploring diverse technologies and crafting innovative applications. He's the mastermind behind Keyviz, a remarkable open-source tool for real-time keystroke visualization. Rahul's contributions extend to the realm of education, where he has authored books and crafted courses on Python programming, benefiting thousands of eager learners.

Course Outline

1. Introduction to the Course

1. What Is Web Scraping

This video provides an overview of web scraping.

2. How Web Scraping Works

This video explains how web scraping works.

3. Web Scraping with Scrapy

This video demonstrates the basic idea of how to do web scraping with Scrapy.


2. Scrapy Installation

1. Scrapy Installation for Windows

This video helps you with Scrapy Installation for Windows.

2. Scrapy Installation for Ubuntu (Linux)

This video explains how to install Scrapy on Ubuntu (Linux).

3. Creating Scrapy Project

This video demonstrates creating a Scrapy project.

4. Project Walkthrough

This video walks you through the project.


3. Scrapy Spider

1. Creating Spider

This video helps in creating the Spider.

2. Sending Request

This video demonstrates sending a request.

3. Getting the Response

This video explains how to get the response.

4. Scrapy CSS Selector

This video explains the Scrapy CSS Selector.

5. Selecting All the Data

This video helps in selecting all the data.

6. Extracting Data

This video helps in extracting the data.

7. Spider Overview

This video provides you with the Spider overview.


4. CSS Selectors

1. CSS Selectors Versus XPath: How to Select Web Elements

This video talks about CSS Selectors versus XPath: how to select web elements.

2. Tagname, Class, and Id Selectors

This video demonstrates the Tagname, Class, and Id Selectors.

3. Attribute Selectors

This video explains attribute selectors.


5. XPath

1. XPath Expressions

This video talks about the XPath expressions.

2. XPath Attribute Selectors

This video explains XPath attribute selectors.

3. XPath text() Function

This video explains the XPath text() function.


6. Scrapy Shell

1. What Is the Scrapy Shell and How to Use It?

This video talks about the Scrapy Shell and explains how to use it.

2. fetch() Response

This video explains the fetch() response.

3. Shell Configuration

This video explains Shell configuration.


7. Scrapy Items

1. Structuring Data into Scrapy Item

This video talks about structuring data into a Scrapy item.

2. Using Item in Spiders

This video is about using an item in Spiders.

3. Define Input and Output Processors for Item Fields

This video helps define input and output processors for item fields.

4. Loading Items with Scrapy ItemLoaders

This video helps in loading items with Scrapy ItemLoaders.

5. Items, Processors, and ItemLoaders Overview

This video provides an overview of items, processors, and ItemLoaders.


8. Exporting Data

1. Output Extracted Data in JSON, CSV, and XML Formats

This video helps in outputting extracted data in JSON, CSV, and XML formats.

2. Overwrite Previous Output

This video explains how to overwrite the previous output.

3. Appending Data to Previous Output

This video explains appending data to the previous output.
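
One way to express the overwrite and append behavior in configuration is the `FEEDS` setting, sketched below with hypothetical file names. On the command line, recent Scrapy releases offer `-o` to append to an output file and `-O` to overwrite it; check the documentation for the version you use.

```python
# settings.py fragment; the output paths are hypothetical.
FEEDS = {
    # Recreate the JSON file on every run.
    "output/products.json": {"format": "json", "overwrite": True},
    # Keep adding rows to the CSV file across runs.
    "output/products.csv": {"format": "csv", "overwrite": False},
}
```

Appending to a plain JSON file generally produces an invalid document, so a line-oriented format such as JSON Lines is usually a safer target for appended output.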


9. Scrapy Item Pipeline

1. How to Use Scrapy Item Pipelines

This video demonstrates how to use Scrapy item pipelines.

2. Saving Data Locally to Excel (XLSX) Files

This video demonstrates saving data locally to Excel (XLSX) files.

3. Enable Item Pipelines in Settings

This video demonstrates enabling item pipelines in settings.

4. MongoDB (Account) Setup

This video demonstrates the setup of MongoDB (Account).

5. Saving Data to MongoDB

This video demonstrates saving data to MongoDB.


10. Pagination

1. Extracting Links from href Attributes

This video explains extracting links from href attributes.

2. Send Request to the Next Page

This video helps in sending requests to the next page.

3. start_requests() Method

This video explains the start_requests() method.


11. Following Links

1. How to Follow Links

This video demonstrates how to follow links.

2. How to Select Data Using Regular Expressions with Scrapy

This video demonstrates how to select data using regular expressions with Scrapy.

3. Setting Up Custom Callback Function

This video demonstrates setting up a custom callback function.

4. Parse Product Details Page

This video talks about parsing the product details page.


12. Scraping Tables

1. HTML Tables

This video talks about the HTML tables.

2. Selecting Tables Data

This video explains selecting table data.

3. Extract Data from HTML Tables

This video demonstrates how to extract data from HTML tables.


13. Logging into Websites

1. Data Hidden with Logging Forms

This video talks about data hidden behind login forms.

2. Inspecting HTML Forms and Website Activity with Dev Tools

This video helps in inspecting HTML forms and website activity with Dev Tools.

3. Logging into Websites with FormRequest

This video explains logging into websites with FormRequest.

4. CSRF Protected Login Forms

This video explains the CSRF protected login forms.

5. Extract CSRF Values from Forms

This video explains how to extract CSRF values from forms.


14. Scraping JavaScript Rendered Websites

1. What Are JavaScript Rendered/Dynamic Websites?

This video explains JavaScript rendered/dynamic websites.

2. scrapy-playwright Installation

This video demonstrates the installation of scrapy-playwright.

3. Setting Up Playwright in Scrapy Project

This video explains setting up Playwright in the Scrapy project.

4. Using Playwright to Render Websites

This video helps in using Playwright to render websites.

5. Scraping Data from Dynamic Websites

This video explains scraping data from dynamic websites.


15. Scrapy Playwright

1. Playwright Overview

This video provides an overview of Playwright.

2. Playwright Page Object

This video talks about the Playwright page object.

3. Logging in with Playwright

This video explains logging in with Playwright.

4. Dynamic Websites with Loading Screens

This video explains dynamic websites with loading screens.

5. Wait for Selector/Elements Using Page Coroutines

This video explains how to wait for selector/elements using page coroutines.

6. Dynamic Websites with Infinite Scroll

This video explains dynamic websites with infinite scroll.

7. Taking Screenshot of Websites

This video helps in taking a screenshot of websites.

8. Rendering Websites to PDF

This video shows how to render websites to PDF.


16. API Endpoints

1. Identifying API Calls

This video helps in identifying API calls.

2. Requesting Data from API

This video helps in requesting data from an API.

3. Extracting Data from API

This video explains extracting data from an API.


17. Settings

1. Scrapy Project Settings

This video is about the Scrapy project settings.

2. Robots Text

This video talks about the robots.txt file.

3. Middleware

This video explains middleware.

4. Autothrottle Extension

This video talks about the Autothrottle extension.
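
A settings fragment touching the topics in this section might look like the following. The setting names are standard Scrapy settings; the values are illustrative, not recommendations from the course.

```python
# settings.py fragment; values are illustrative only.
ROBOTSTXT_OBEY = True                  # respect the site's robots.txt rules
DOWNLOAD_DELAY = 1.0                   # base delay between requests, in seconds

AUTOTHROTTLE_ENABLED = True            # adapt the delay to server response times
AUTOTHROTTLE_START_DELAY = 1.0
AUTOTHROTTLE_MAX_DELAY = 30.0
AUTOTHROTTLE_TARGET_CONCURRENCY = 1.0
```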


18. User Agents and Proxies

1. What Are User Agents?

This video talks about user agents.

2. User Agents with Scrapy

This video explains user agents with Scrapy.

3. What Are Proxies?

This video explains proxies.

4. Proxies with Scrapy

This video talks about Proxies with Scrapy.


19. Tips and Tricks

1. Spider Arguments

This video demonstrates Spider arguments.

2. Standalone Spiders

This video demonstrates standalone Spiders.

3. Scrapy Shell with bpython

This video is about the Scrapy shell with bpython.

4. Scrapy Get Versus Extract Method

This video demonstrates the Scrapy get versus extract methods.

5. Logging

This video talks about logging.


20. Project 1: Champions League Table from ESPN.com

1. Overview

This video provides an overview of the project.

2. Website Visual Inspection

This video demonstrates the website visual inspection.

3. Finding the Selectors

This video explains finding the selectors.

4. Building the Spider: Extract Teams Data

This video explains building the spider: extracting teams data.

5. Building the Spider: Extract Teams Details

This video explains building the spider: extracting teams details.


21. Project 2: Amazon Product Rank

1. Overview

This video provides an overview of the project.

2. Scraper Visualization

This video helps with Scraper visualization.

3. Finding the Selectors

This video helps in finding the selectors.

4. Building the Spider

This video helps in building the spider.


22. Project 3: Extending Scraper with GUI

1. Scraper Application

This video explains the scraper application.

2. Building the GUI (Application Interface)

This video helps in building the GUI (application interface).

3. Running the Spider from the Application

This video helps in running the spider from the application.

About The Provider

Packt
Birmingham
Founded in 2004 in Birmingham, UK, Packt’s mission is to help the world put software to work in new ways, through the delivery of effective learning and i...