This is not a tutorial on how to scrape the web. This is something new I’m trying out. I’ve decided to document the whole process of creating my new “business”. And as you might guess from the post’s title, it is based on data gathered online. That’s why I’m gonna document it on this blog: I think some of you web scrapers are also interested in building a business around web scraping. This is what I’ve been figuring out how to do, and I’m gonna share everything I experience while building it.
To be honest, I’m just a young dude and I know nothing about building a business. I just know how to scrape the web and wanna build something interesting with it. I have no idea what I’m doing but I’m trying to figure it out. This blog post series isn’t gonna be about what I think I would do or what I suppose would work. It’s gonna be about what I’m doing right now. I hope some people will find it interesting and get some value out of it.
Pricing Intelligence Platform
First of all, what kind of business am I talking about? I’m building pricing intelligence software. It helps ecommerce companies optimize their prices through competitor monitoring and analysis. If you have no clue what I’m talking about, that’s fine. Some months ago I knew nothing about pricing intelligence software either, but I’ve done a lot of research on the topic: I read a bunch of ebooks and PDFs and watched videos about it. So now I understand how this kind of software works and why it is beneficial for ecommerce companies.
Scraping Prices from Competitor Sites
The heart of a pricing intelligence platform is online data. If you can’t gather online data in some way, you can’t create a platform around it. Fortunately, I know how to fetch data from websites, so I’ve been figuring out all the other stuff I need to know. Web scraping is the first step. It is the starting point for creating a pricing intelligence platform. What do I scrape? Primarily, I gather information about products’ prices. At the moment, I only fetch data from marketplace/price comparison websites. But later, I’m gonna gather pricing data directly from the competitors’ websites.
Crawling, Matching, Analytics
The software has 3 main modules. The first one is the crawling engine. It gathers information from marketplace/price comparison websites, then cleans and standardizes the data. After that, all scraped data is sent to a structured database. The second module is the product matching engine. Its job is to match the web shop’s products with the competitors’ products so it’s possible to run a price analysis. This module is not functioning yet because I gather information from marketplace websites, and they already match the same products for me. The third one is the analytics engine. It produces the visualization of the analysis. Ultimately, this is the module that the user sees and works with.
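To make the first module a bit more concrete, here is a minimal sketch of the "clean, standardize, store" step followed by a very simple analysis query. This is not the actual code of the platform. The function names (`parse_price`, `store_offers`, `lowest_prices`), the `offers` table layout, and the sample rows are all made up for illustration. It also assumes prices use `.` as the decimal separator and `,` as the thousands separator, which won't hold for every site.

```python
import re
import sqlite3
from decimal import Decimal


def parse_price(raw: str) -> Decimal:
    """Turn a scraped price string such as '$1,299.00' into a Decimal.

    Assumption: '.' is the decimal separator and ',' groups thousands.
    """
    # Drop currency symbols, letters and whitespace, then the thousands commas.
    cleaned = re.sub(r"[^\d.,]", "", raw).replace(",", "")
    return Decimal(cleaned)


def store_offers(conn: sqlite3.Connection, offers) -> None:
    """Standardize raw (product, seller, raw_price) rows and save them."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS offers "
        "(product TEXT, seller TEXT, price_cents INTEGER)"
    )
    rows = [
        # Store prices as integer cents to avoid floating point rounding.
        (product, seller, int(parse_price(raw_price) * 100))
        for product, seller, raw_price in offers
    ]
    conn.executemany("INSERT INTO offers VALUES (?, ?, ?)", rows)
    conn.commit()


def lowest_prices(conn: sqlite3.Connection) -> dict:
    """Return the lowest price in cents seen for each product."""
    cursor = conn.execute(
        "SELECT product, MIN(price_cents) FROM offers GROUP BY product"
    )
    return dict(cursor.fetchall())


if __name__ == "__main__":
    # Hypothetical rows, shaped like what a crawler might hand over.
    scraped = [
        ("Widget", "ShopA", "$1,299.00"),
        ("Widget", "ShopB", "USD 1,249.50"),
        ("Gadget", "ShopA", " 19.99 "),
    ]
    connection = sqlite3.connect(":memory:")
    store_offers(connection, scraped)
    print(lowest_prices(connection))
```

Because the marketplace sites already group offers by product, the product name can serve as the matching key here. Once data comes straight from competitors’ websites, that shortcut goes away and a real matching step is needed.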
That’s it for now. This was just a brief intro to what I’ve been working on lately. In the next post I will talk about how I “validated the idea” and got my first customer before having functioning software.
Let me know your thoughts and questions in the comment section.