What Role Does Web Scraping Play in Fashion E-Commerce?

Anil Prajapati
3 min read · May 12, 2022

Web data extraction, or web scraping, is a data mining technique that pulls data from websites and converts it into structured data for further analysis. Although ample data is available online, getting data that is authentic, relevant, and current is not easy. Web scraping is the collection technique used to obtain it.

In practice, the procedure involves consolidating the relevant data from a given website and then analyzing it.

Online data extraction can be done manually. By automating the procedure, however, businesses can collect far more data in much less time.
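To make "converting a page into structured data" concrete, here is a minimal sketch that uses only Python's standard library. The HTML snippet is hypothetical: it is loosely modeled on the class names and data attributes that the study code later in this article reads, and it is not a copy of any real page. The names `ProductParser` and `parse_products` are invented for this illustration.

```python
from html.parser import HTMLParser

# Hypothetical markup, loosely modeled on a product listing page.
SAMPLE_HTML = """
<ul class="products-listing small">
  <li><article class="hm-product-item" data-articlecode="0001" data-category="men_jeans_slim">
    <a class="link" href="#">Slim Jeans</a><span class="price regular">$ 29.99</span>
  </article></li>
  <li><article class="hm-product-item" data-articlecode="0002" data-category="men_jeans_skinny">
    <a class="link" href="#">Skinny Jeans</a><span class="price regular">$ 39.99</span>
  </article></li>
</ul>
"""


class ProductParser(HTMLParser):
    """Collect one dict per <article class="hm-product-item"> element."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._field = None  # field that the next text node should fill

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "article" and "hm-product-item" in classes:
            # A new product starts: read its id and category attributes.
            self.rows.append({
                "product_id": attrs.get("data-articlecode"),
                "product_category": attrs.get("data-category"),
            })
        elif tag == "a" and "link" in classes and self.rows:
            self._field = "product_name"
        elif tag == "span" and "price" in classes and self.rows:
            self._field = "product_price"

    def handle_data(self, data):
        # Store the text only when we are inside a name or price element.
        if self._field and data.strip():
            self.rows[-1][self._field] = data.strip()
            self._field = None

    def handle_endtag(self, tag):
        if tag in ("a", "span"):
            self._field = None


def parse_products(html):
    """Return a list of product dicts extracted from an HTML string."""
    parser = ProductParser()
    parser.feed(html)
    parser.close()
    return parser.rows


for row in parse_products(SAMPLE_HTML):
    print(row)
```

Each printed dict is one structured record. In a real scraper the HTML string would come from an HTTP response rather than from a constant.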

Example Uses of Web Scraping

Web scraping is a valuable tool for businesses across diverse sectors and for different needs. It can be used, for instance, to access industry statistics, generate leads, and conduct market research. Here are a few examples of commercial uses.

  • Data Science & Data Analytics: Collecting machine learning training data and enriching enterprise databases.
  • Corporate Communication: Gathering news about the business.
  • Finance: Collecting financial data.
  • Sales & Marketing: Price comparison, SEO, product description searching, lead generation, consumer sentiment monitoring, and website testing.
  • Strategy: Market research.

Study Objective

  • Gather data from a fashion e-commerce website.
  • Apply the web scraping method using the Requests and BeautifulSoup libraries.
  • Gather all data, including product name, product ID, and product price.
  • Compare pricing for a particular product category.
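The last objective, comparing prices within a product category, could be sketched as follows. This is plain Python so that it runs without extra packages; the same summary could also be computed with a pandas `groupby`. The sample rows and the helper names `parse_price` and `price_summary` are assumptions made for this illustration, and the prices are invented.

```python
from collections import defaultdict
from statistics import mean

# Invented sample rows using the field names collected in this study.
SAMPLE_ROWS = [
    {"product_category": "men_jeans_slim", "product_price": "$ 29.99"},
    {"product_category": "men_jeans_slim", "product_price": "$ 19.99"},
    {"product_category": "men_jeans_skinny", "product_price": "$ 39.99"},
]


def parse_price(text):
    """Turn a price string such as '$ 29.99' into a float."""
    return float(text.replace("$", "").replace(",", "").strip())


def price_summary(rows):
    """Return count, min, mean and max price for each product category."""
    by_category = defaultdict(list)
    for row in rows:
        by_category[row["product_category"]].append(parse_price(row["product_price"]))
    return {
        category: {
            "count": len(prices),
            "min": min(prices),
            "mean": round(mean(prices), 2),
            "max": max(prices),
        }
        for category, prices in by_category.items()
    }


for category, stats in price_summary(SAMPLE_ROWS).items():
    print(category, stats)
```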

Company Studied

The fashion e-commerce website studied was H&M, a Swedish multinational fashion company present in 74 markets with over 5,000 stores.

H&M's business idea is fashion and quality at the best price in a sustainable way. Alongside its flagship H&M brand, the H&M Group owns COS, Monki, Weekday, & Other Stories, Arket, Afound, and Sellpy.

    # Importing the libraries
    from bs4 import BeautifulSoup
    import requests
    import pandas as pd
    from datetime import datetime

    # URL to scrape
    # url = 'https://www2.hm.com/en_us/men/products/jeans.html'
    # Changed here to fetch all products from all pages
    url = 'https://www2.hm.com/en_us/men/products/jeans.html?sort=stock&image-size=small&image=model&offset=0&page-size=102'

    # Make the request look like it comes from a browser
    headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5),AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36'}

    # Store the response of the request in a variable
    page = requests.get(url, headers=headers)
    soup = BeautifulSoup(page.text, 'html.parser')

    # Grab everything in the product showcase of the site
    products = soup.find('ul', class_='products-listing small')
    product_list = products.find_all('article', class_='hm-product-item')

    # Quick check of a single item
    product_list[3].get('data-articlecode')

    # product id
    product_id = [p.get('data-articlecode') for p in product_list]

    # product category
    product_category = [p.get('data-category') for p in product_list]

    # product name
    product_list = products.find_all('a', class_='link')
    product_name = [p.get_text() for p in product_list]

    # price
    product_list = products.find_all('span', class_='price regular')
    product_price = [p.get_text() for p in product_list]

    # Put everything into a pandas DataFrame
    data = pd.DataFrame([product_id, product_category, product_name, product_price]).T
    data.columns = ['product_id', 'product_category', 'product_name', 'product_price']

    # Time at which the scrape was run
    data['scrapy_datetime'] = datetime.now().strftime('%Y-%m-%d %H:%M:%S')

    # Remove the $ character from the product_price column
    # (regex=False so that '$' is matched literally, not as a regex end anchor)
    data['product_price'] = data['product_price'].str.replace('$', '', regex=False)

Final Code Results

Conclusion

Once the scraped data is saved in CSV format, you can easily run an exploratory analysis to identify which products changed price, based on the date and time at which each scraping run was executed.
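As a rough sketch of that kind of analysis, the snippet below compares two scraping runs and reports the products whose price differs. It assumes each run was saved as a CSV with the `product_id`, `product_price`, and `scrapy_datetime` columns used in the study code, and that the `$` sign has already been stripped from the price. The two in-memory snapshots, their values, and the helper names `load_prices` and `price_changes` are invented for illustration.

```python
import csv
import io

# Invented snapshots standing in for two CSV files saved on different days.
RUN_1 = (
    "product_id,product_price,scrapy_datetime\n"
    "0001,29.99,2022-05-01 09:00:00\n"
    "0002,39.99,2022-05-01 09:00:00\n"
)
RUN_2 = (
    "product_id,product_price,scrapy_datetime\n"
    "0001,29.99,2022-05-08 09:00:00\n"
    "0002,34.99,2022-05-08 09:00:00\n"
    "0003,49.99,2022-05-08 09:00:00\n"
)


def load_prices(csv_file):
    """Map product_id to price (as float) from an open CSV file object."""
    reader = csv.DictReader(csv_file)
    return {row["product_id"]: float(row["product_price"]) for row in reader}


def price_changes(old, new):
    """Return {product_id: (old_price, new_price)} for products present in
    both runs whose price differs."""
    return {
        pid: (old[pid], new[pid])
        for pid in old.keys() & new.keys()
        if old[pid] != new[pid]
    }


old_prices = load_prices(io.StringIO(RUN_1))
new_prices = load_prices(io.StringIO(RUN_2))
print(price_changes(old_prices, new_prices))
```

With real files, `io.StringIO(...)` would be replaced by `open(path, newline="")`. Products that appear in only one run (such as `0003` here) are ignored by this comparison.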

For fashion e-commerce data scraping, contact X-Byte Enterprise Crawling or ask for a free quote!

Originally published at https://www.xbyte.io.
