Quotes2Insights

Quotes2Insights is a project that demonstrates the end-to-end process of extracting, cleaning, and analyzing web data. In this project, we scraped quotes from the website Quotes to Scrape (http://quotes.toscrape.com), stored the data in CSV files, derived insights using SQL, and performed exploratory data analysis and visualizations using Python.

Project Overview & Objective:

The objective of this project is to show how raw web data can be transformed into actionable insights. We accomplished this by:

  • Web Scraping: Extracting quotes, authors, and tags from the website (a minimal scraping sketch follows this list).
  • SQL Insights: Loading the scraped data into a SQL database and answering key analytical questions (e.g., most quoted authors, top tags).
  • Exploratory Data Analysis & Visualization: Using Python (Pandas, Matplotlib, Seaborn, and WordCloud) to analyze and visualize the data.

This process demonstrates the full pipeline, from data collection, cleaning, and normalization to drawing meaningful insights, which is crucial for data-driven decision-making.
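For orientation, here is a minimal sketch of the kind of scraping loop used in this project. It is illustrative only: the selectors match the public structure of quotes.toscrape.com, but the variable names and CSV column headers (author, quote, tags) are assumptions; the authoritative code is in Web Scrapping.ipynb.

```python
# Minimal scraping sketch (illustrative; see Web Scrapping.ipynb for the real code).
import csv

import requests
from bs4 import BeautifulSoup

BASE_URL = "http://quotes.toscrape.com"

rows = []
url = "/page/1/"
while url:
    soup = BeautifulSoup(requests.get(BASE_URL + url).text, "html.parser")
    for q in soup.select("div.quote"):
        # Strip the curly quotation marks that wrap each quote on the site.
        text = q.select_one("span.text").get_text(strip=True).strip("\u201c\u201d")
        author = q.select_one("small.author").get_text(strip=True)
        tags = ",".join(t.get_text(strip=True) for t in q.select("a.tag"))
        rows.append({"author": author, "quote": text, "tags": tags})
    # Follow the "Next" pagination link until it disappears on the last page.
    next_link = soup.select_one("li.next > a")
    url = next_link["href"] if next_link else None

with open("quotes_data.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["author", "quote", "tags"])
    writer.writeheader()
    writer.writerows(rows)
```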

Files in the Repository:

  • Web Scrapping.ipynb: Contains the Python script that scrapes the website, extracts the author, quote, and tag names, cleans the quote text, and saves the raw data as a CSV file (quotes_data.csv).

  • quotes_data.csv: The CSV file generated from the web scraping process. It contains the raw scraped data with columns for the author, quote, and tag names.

  • SQL_Insights.sql: Contains SQL queries and commands used to load the data into a SQL database, create normalized tables, and derive insights such as the number of quotes per author, top tags, and other key statistics.

  • cleaned_dataset.csv: A cleaned version of the dataset generated in the EDA phase. In this file, missing values have been imputed and the data has been prepared for further analysis and visualization.

  • EDA & Visualizations.ipynb: A Jupyter Notebook that performs exploratory data analysis and generates visualizations—including bar charts, word clouds, and pie charts—using the cleaned dataset.

How to Run the Project:

  1. Web Scraping: Open and run Web Scrapping.ipynb in Jupyter Notebook to scrape the website and generate quotes_data.csv.

  2. SQL Insights: Load the dataset into your SQL database (e.g., MySQL) using the commands in SQL_Insights.sql to create normalized tables and derive insights (an illustrative schema and example queries appear after this list).

  3. EDA & Visualizations: Open and run EDA & Visualizations.ipynb in Jupyter Notebook. In this notebook, missing values are imputed and the cleaned data is saved as cleaned_dataset.csv for further analysis and visualization (a short EDA sketch appears below).
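The sketch below shows one plausible normalized layout and two of the insight queries mentioned above (quotes per author, top tags). The table and column names here are assumptions for illustration; the actual DDL and queries live in SQL_Insights.sql.

```sql
-- Illustrative normalized schema (assumed names; see SQL_Insights.sql for the real DDL).
CREATE TABLE authors (
    author_id INT AUTO_INCREMENT PRIMARY KEY,
    name      VARCHAR(255) NOT NULL UNIQUE
);

CREATE TABLE quotes (
    quote_id  INT AUTO_INCREMENT PRIMARY KEY,
    author_id INT NOT NULL,
    quote     TEXT NOT NULL,
    FOREIGN KEY (author_id) REFERENCES authors (author_id)
);

CREATE TABLE tags (
    tag_id INT AUTO_INCREMENT PRIMARY KEY,
    name   VARCHAR(100) NOT NULL UNIQUE
);

CREATE TABLE quote_tags (
    quote_id INT NOT NULL,
    tag_id   INT NOT NULL,
    PRIMARY KEY (quote_id, tag_id),
    FOREIGN KEY (quote_id) REFERENCES quotes (quote_id),
    FOREIGN KEY (tag_id)   REFERENCES tags (tag_id)
);

-- Most quoted authors.
SELECT a.name, COUNT(*) AS quote_count
FROM quotes q
JOIN authors a ON a.author_id = q.author_id
GROUP BY a.name
ORDER BY quote_count DESC
LIMIT 10;

-- Top tags.
SELECT t.name, COUNT(*) AS tag_count
FROM quote_tags qt
JOIN tags t ON t.tag_id = qt.tag_id
GROUP BY t.name
ORDER BY tag_count DESC
LIMIT 10;
```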
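For the EDA step, the sketch below mirrors the flow described above: impute missing values, save cleaned_dataset.csv, then plot. The column names and the "Unknown" fill value are assumptions; the notebook EDA & Visualizations.ipynb is the source of truth.

```python
# Illustrative EDA sketch (assumed column names; see EDA & Visualizations.ipynb).
import matplotlib.pyplot as plt
import pandas as pd
from wordcloud import WordCloud

df = pd.read_csv("quotes_data.csv")

# Impute missing values before analysis, then persist the cleaned copy.
df = df.fillna({"author": "Unknown", "tags": ""})
df.to_csv("cleaned_dataset.csv", index=False)

# Bar chart: top 10 most quoted authors.
df["author"].value_counts().head(10).plot(kind="bar")
plt.title("Most Quoted Authors")
plt.tight_layout()
plt.show()

# Word cloud built from the full quote text.
cloud = WordCloud(width=800, height=400, background_color="white")
cloud.generate(" ".join(df["quote"].astype(str)))
plt.imshow(cloud, interpolation="bilinear")
plt.axis("off")
plt.show()
```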

Project Team:

This project is a collaborative effort by three contributors.

Dependencies:

  • Python 3.x

  • Jupyter Notebook

  • Libraries: pandas, requests, beautifulsoup4 (bs4), matplotlib, seaborn, wordcloud, and others as required.

  • SQL Database: MySQL (or a similar database) for running SQL queries.
