How Are Selenium and BeautifulSoup Used for Scraping UberEATS Data?

Understanding the Website Structure

After signing in to an Uber account, we can access two pages that provide trip information. Statements are available for every week of any month.

Downloading the statements gives us CSV files containing a unique trip ID for every trip.

It is also possible to scrape data from another page: https://drivers.uber.com/p3/fleet-manager/trips. Clicking on a particular trip expands its details.

This page, as can be seen, has precise pick-up and drop-off locations, but the cost, surge, and total received are absent. For this blog, we’ll refer to this data scraping technique as the “Uber Fleet Manager Scraper.”

The plan is to extract information from both sources and then merge it by matching similar duration and distance values. This will be discussed in more detail in a future blog post.

Setting Up Selenium for Google Chrome

We must now configure Selenium for Google Chrome. The first step is to download the ChromeDriver that matches our installed Chrome version and save it somewhere safe. Because we’re using Jupyter Notebook and the Anaconda package manager, we’ll install Selenium from code.
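In a notebook cell, the install is a one-liner (shown here with pip; a conda install would work just as well):

    # Run inside a Jupyter Notebook cell; the "!" prefix executes a shell command.
    !pip install selenium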

After that, we’ll use Selenium to launch Google Chrome. The following snippet does exactly that.
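A minimal version of that snippet, assuming the ChromeDriver binary sits next to the notebook (on Selenium 4+ you would wrap the path in a Service object instead), might look like this:

    from selenium import webdriver

    # Point Selenium at the ChromeDriver downloaded earlier.
    # The "./chromedriver" path is an assumption -- use wherever you saved it.
    driver = webdriver.Chrome(executable_path="./chromedriver")

    # Load google.com to confirm the browser is under Selenium's control.
    driver.get("https://www.google.com")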

This code should open a new Google Chrome window with google.com loaded. We are now ready to begin scraping. Do not close the Chrome window.

Scraping Uber Trip Information

Before we can start scraping, we need to download statements for all of the weeks we want to extract trip data from. To download the statements, click the link above. Once finished, place all of the downloaded files into a new folder called “Statements.”

You can now open the trip page for any trip ID using the following link: https://drivers.uber.com/p3/payments/trips/{your_trip_id}.
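For example, with the driver from the previous step (the trip ID below is a placeholder):

    # Build the trip-details URL from a trip ID taken from the CSV statements.
    trip_id = "00000000-0000-0000-0000-000000000000"  # placeholder trip ID
    driver.get(f"https://drivers.uber.com/p3/payments/trips/{trip_id}")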

However, the Uber website requires us to sign in before we can proceed. Use the steps below to log in.

For further verification, you can use the code below:
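One simple sketch of such a check, assuming Uber redirects unauthenticated visitors to its auth.uber.com login page, is:

    # After signing in manually in the Selenium-controlled window,
    # reload a trip page and confirm we are no longer redirected to login.
    driver.get(f"https://drivers.uber.com/p3/payments/trips/{trip_id}")
    if "auth.uber.com" in driver.current_url:
        print("Still on the login page -- complete the sign-in first.")
    else:
        print("Signed in; the trip page loaded.")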

We’ve now completed the sign-in process, and the trip ID webpage should be displayed.

Now we must develop the code that will collect information from all the trips. Our code must perform the following steps in order.

  • First, we must put all the statements for all of the weeks in a folder called “Statements.”
  • The script must then isolate the Trip ID column across all CSV files and build a link for each trip.
  • Finally, the script must access the website address, scrape information from it, and save the information.

The code below can be used to accomplish all of the above tasks:
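A minimal sketch of such a script, assuming each statement CSV exposes a “Trip ID” column and using placeholder CSS selectors for the trip page (inspect the live page to find the real ones):

    import glob
    import time

    import pandas as pd
    from bs4 import BeautifulSoup

    # 1. Collect the Trip ID column from every statement CSV.
    trip_ids = []
    for csv_file in glob.glob("Statements/*.csv"):
        statement = pd.read_csv(csv_file)
        trip_ids.extend(statement["Trip ID"].dropna().tolist())

    # 2. Visit each trip page and scrape its details.
    records = []
    for trip_id in trip_ids:
        driver.get(f"https://drivers.uber.com/p3/payments/trips/{trip_id}")
        time.sleep(3)  # crude wait for the page to render

        soup = BeautifulSoup(driver.page_source, "html.parser")

        # The class names below are placeholders, not Uber's real markup.
        record = {"trip_id": trip_id}
        for row in soup.find_all("div", class_="trip-detail-row"):
            label = row.find("span", class_="label")
            value = row.find("span", class_="value")
            if label and value:
                record[label.get_text(strip=True)] = value.get_text(strip=True)
        records.append(record)

    # 3. Save the combined dataset.
    pd.DataFrame(records).to_csv("uber_trips_scraped.csv", index=False)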

The Uber Trip Scraper link will take you to a full Jupyter notebook implementation. Running it should leave you with a scraped dataset of trip details.

Extracting Uber Fleet Manager Information

To extract information from Fleet Manager, open the website via the link above. Our code must perform the following functions:

  • Click on all of the trips to expand the tables.
  • Fetch all of the information.
  • Apply the code to all the pages in a given week.

However, some manual work is required: we need to navigate to a specific week by hand. After that, we can begin the scraping procedure. The following code can be used to do all of the required tasks.
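A sketch of that code, with placeholder CSS selectors that you would replace after inspecting the live Fleet Manager page:

    import time

    from bs4 import BeautifulSoup
    from selenium.webdriver.common.by import By

    def scrape_fleet_manager_page(driver):
        """Expand every trip row on the current page and scrape it."""
        # Click each collapsed trip row to expand its details table.
        for row in driver.find_elements(By.CSS_SELECTOR, "div.trip-row"):
            row.click()
            time.sleep(1)  # give the expanded table time to render

        # Parse the fully expanded page with BeautifulSoup.
        soup = BeautifulSoup(driver.page_source, "html.parser")
        trips = []
        for detail in soup.find_all("div", class_="trip-details"):
            cells = [c.get_text(strip=True) for c in detail.find_all("td")]
            trips.append(cells)
        return trips

    # Navigate to the desired week manually, then scrape page by page,
    # clicking "next" until the button disappears (selector assumed).
    all_trips = []
    while True:
        all_trips.extend(scrape_fleet_manager_page(driver))
        next_buttons = driver.find_elements(By.CSS_SELECTOR, "button.next-page")
        if not next_buttons:
            break
        next_buttons[0].click()
        time.sleep(2)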

We are always keen to deliver quality scraped data. If you have any queries, feel free to contact us.

Originally published at https://www.xbyte.io.
