Background

With over 3 years of experience in turning complex data into actionable insights, I bring a strong foundation in Mathematics, alongside expertise in SQL, Python, Tableau, Power BI, and Snowflake. My work is driven by a passion for solving business challenges, automating processes, and delivering data-driven strategies that support strategic decision-making.

I specialize in data preparation, ETL processes, and the creation of dynamic dashboards to empower teams and streamline operations. Throughout my career, I’ve consistently enhanced data workflows, resulting in improved reporting efficiency and stronger alignment between business objectives and data insights.

In my previous role, I was responsible for building automated reporting systems, streamlining data pipelines, and creating customized dashboards that empowered stakeholders to make informed decisions faster. My ability to communicate complex data findings clearly and collaborate effectively with cross-functional teams made me a key player in driving data-driven initiatives.

I am now seeking new opportunities where I can leverage my technical and business expertise to help organizations unlock the full potential of their data.

Projects

Tableau Dashboards

These dashboards were built from data prepared with SQL and Excel/Google Sheets.

Contains Tableau Dashboards for Projects on Esports Earnings, Top 10 Most Profitable vs Least Profitable Movies of all time, AUDJPY FTP & TSL, AUDJPY Wins & Win Percentage Per Session, AUDJPY Confluences, Buys & Sells, Videogame Sales, COVID-19, NBA Productivity, Earthquake Magnitude & Depth, AirBnB Exploration, etc.

View

GitHub

Contains SQL Queries from SQL Server, Python Code from Jupyter Notebook & Datasets for every project that I have completed.

View

Top 20 Most Productive NBA Players In The Last 10 Years

Most productive players based on the productivity statistics throughout the NBA such as:

Points Per Game (PPG), Rebounds (REB), Assists (AST), Blocks (BLK), Steals (STL) etc. NBA Dataset is here.

Top 10 Most Profitable vs Least Profitable Movies of all time

The most profitable vs least profitable movies and the exploration of statistics across many different areas, such as title, genre, director, writer, etc. from 1980-2020. Movies Dataset is here.

AirBnB Exploration

Imagine a client wants to start an AirBnB business in Seattle, Washington. They want to know the best place to buy a home, along with other factors such as location, number of bedrooms, and how much they can charge customers (i.e. they want to make the most profit per property). AirBnB Dataset is here.

Esports Earnings

Total and average earnings by game, country & genre across 100 Esports players in 10 different games. Esports Earnings Dataset is here.

AUDJPY Exploration

Over the course of one and a half months, 130 trades were taken. For those 130 trades, total trades, wins, losses, win percentage, etc. were examined for Fixed Take Profit (FTP) & Trailing Stop Loss (TSL) trades. A deeper dive then covered the minimum, maximum, average and total profit made from each confluence, as well as the number of times each confluence occurred. AUDJPY Dataset is here.

SQL Queries

Total AUDJPY Trades, Buys, Sells & Confluences

Across the 130 trades, we see how many trades were taken, the number of buys and sells, the number of wins and losses, and the type and number of confluences per trade. AUDJPY Dataset is here.

AUDJPY Wins & Win Percentage Per Session

Total wins & win percentage across all 130 trades per session, for both Fixed Take Profit (FTP) & Trailing Stop Loss (TSL) trades, as well as wins & win percentage combined. AUDJPY Dataset is here.
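As a rough illustration of how a per-session win percentage can be computed, here is a minimal Python sketch. The field names `session` and `result`, the session labels, and the sample rows are all assumptions made for illustration; they are not taken from the AUDJPY dataset or the dashboard itself.

```python
from collections import defaultdict

# Hypothetical sample rows; not taken from the AUDJPY dataset.
trades = [
    {"session": "Asian", "result": "win"},
    {"session": "Asian", "result": "loss"},
    {"session": "London", "result": "win"},
    {"session": "London", "result": "win"},
    {"session": "New York", "result": "loss"},
]

def win_percentage_by_session(rows):
    """Return {session: (wins, win percentage)} for the given trades."""
    totals = defaultdict(int)
    wins = defaultdict(int)
    for row in rows:
        totals[row["session"]] += 1
        if row["result"] == "win":
            wins[row["session"]] += 1
    # Win percentage = wins / total trades in that session * 100.
    return {s: (wins[s], 100 * wins[s] / totals[s]) for s in totals}

summary = win_percentage_by_session(trades)
for session, (won, pct) in summary.items():
    print(f"{session}: {won} wins, {pct:.1f}% win rate")
```

The same grouping could be run separately on FTP and TSL trades and then on both together to get the combined figures.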

AUDJPY Fixed Take Profit (FTP) & Trailing Stop Loss (TSL)

Compares the profit from each confluence for Fixed Take Profit (FTP) & Trailing Stop Loss (TSL) trades in each individual session, as well as the profit from each confluence across all sessions combined. AUDJPY Dataset is here.

Earthquake Magnitude & Depth

An exploration of the magnitude and depth of 23,119 earthquakes that occurred from 1969-2018, where magnitude is measured on the moment magnitude scale (Mw) and depth is measured in kilometers (km). Earthquake Dataset is here.

Videogame Exploration for Playstation, XBOX & PC

PlayStation, Xbox & PC global sales (in millions & billions) by game, platform and genre, plus the total number of games by platform, publisher, genre and year, from 1980-2020. Videogame Sales Dataset is here.

COVID-19 Exploration

Through the use of SQL Server, we analyze and explore the global statistics of COVID-19, such as Deaths & Vaccinations. COVID-19 Dataset is here.

Data Cleaning In SQL

In this project, we use SQL Server to clean messy data by removing unwanted, duplicate, or unimportant records so that the data is easier to read and analyze. Nashville Housing Dataset is here.

SQL Queries

Amazon Web Scraping In Python

In this project, we use Python to scrape data from Amazon and analyze the price data for different products.

View

Movie Correlation In Python

In this project, we use Python to look at different variables and see which ones have an effect on a movie's gross revenue. Movies Dataset is here.

View

Clean & Analyze Employee Exit Surveys In Python

Stakeholders want to know:

1. Are employees who only worked at institutions for a short period of time resigning due to some kind of dissatisfaction? What about employees who have been there longer?

2. Are younger employees resigning due to some kind of dissatisfaction?

3. What about older employees?

DETE Exit Survey Dataset and TAFE Exit Survey Dataset

View

Exploring eBay Car Sales Data In Python

We'll work with a dataset of used cars from eBay Kleinanzeigen, a section of the German eBay website. The aim of this project is to clean the data and analyze the included used car listings. The dataset can be found here.

View

Exploring Hacker News Posts In Python

Hacker News is a site started by the startup incubator Y Combinator, where user-submitted stories (known as "posts") receive votes and comments, similar to Reddit. Hacker News is extremely popular in technology and startup circles, and posts that make it to the top of the Hacker News listings can get hundreds of thousands of visitors as a result.

We're specifically interested in posts with titles that begin with either Ask HN or Show HN. Users submit Ask HN posts to ask the Hacker News community a specific question. Below are a few examples:

Ask HN: How to improve my personal website?

Ask HN: Am I the only one outraged by Twitter shutting down share counts?

Ask HN: Any recent changes to CSS that broke mobile?

Likewise, users submit Show HN posts to show the Hacker News community a project, product, or just something interesting. Below are a few examples:

Show HN: Wio Link ESP8266 Based Web of Things Hardware Development Platform

Show HN: Something pointless I made

Show HN: Shanhu.io, a programming playground powered by e8vm

We'll compare these two types of posts to determine the following:

Do Ask HN or Show HN posts receive more comments on average?

Do posts created at a certain time receive more comments on average?

The dataset can be found here.
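The first question, comparing average comments, can be sketched in a few lines of Python. This is only an illustrative sketch: the field names `title` and `num_comments` and the comment counts below are assumptions, not values from the actual dataset.

```python
# Hypothetical sample posts; the comment counts are made up for illustration.
posts = [
    {"title": "Ask HN: How to improve my personal website?", "num_comments": 6},
    {"title": "Ask HN: Any recent changes to CSS that broke mobile?", "num_comments": 4},
    {"title": "Show HN: Something pointless I made", "num_comments": 2},
    {"title": "Show HN: Shanhu.io, a programming playground powered by e8vm", "num_comments": 3},
    {"title": "An unrelated story", "num_comments": 10},
]

def average_comments(rows, prefix):
    """Average number of comments for posts whose title starts with `prefix`."""
    counts = [
        row["num_comments"]
        for row in rows
        if row["title"].lower().startswith(prefix)
    ]
    # Avoid dividing by zero when no post matches the prefix.
    return sum(counts) / len(counts) if counts else 0.0

avg_ask = average_comments(posts, "ask hn")
avg_show = average_comments(posts, "show hn")
print(f"Ask HN: {avg_ask:.2f} comments per post")
print(f"Show HN: {avg_show:.2f} comments per post")
```

The second question follows the same pattern, grouping by the hour each post was created instead of by title prefix.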

View

Finding Heavy Traffic Indicators on I-94 In Python

In this project, I used two types of indicators to assess the traffic volume on I-94:

Time indicators:

The traffic is usually heavier during warm months (March–October) compared to cold months (November–February).

The traffic is usually heavier on business days compared to the weekends.

On business days, the rush hours are around 7:00 and 16:00.

Weather indicators:

Snow shower

Light rain and snow

Proximity thunderstorm with drizzle

The dataset can be found here.

View

Profitable App Profiles for the App Store & Google Play Markets In Python

For this project, suppose we're working as a data analyst for a company that builds Android and iOS mobile apps. We make our apps available on Google Play and in the App Store. We only build apps that are free to download and install, and our main source of revenue consists of in-app ads. This means that the number of users of our apps determines our revenue for any given app (i.e. the more users who see and engage with the ads, the better). Our goal for this project is to analyze data to help our developers understand what types of apps are likely to attract more users. The dataset for Android apps from Google Play can be found here, and the dataset for iOS apps from the App Store can be found here.

View

Analyzing NYC High School Data In Python

In this project, we examine whether the SAT is a fair test for high school students in New York City, which has a significant immigrant population and is very diverse. Comparing demographic factors such as race, income, and gender with SAT scores is therefore a good way to assess the test's fairness. For example, if certain racial groups consistently perform better on the SAT, we would have some evidence that the SAT is unfair. The dataset can be found here.

View

Storytelling Data Visualization on Exchange Rates In Python

Our focus in the guided part of the project will be on the exchange rate between the Euro and the American dollar.

The dataset we'll use describes Euro daily exchange rates between 1999 and 2021. The Euro (symbolized with €) is the official currency in most of the countries of the European Union.

If the exchange rate of the Euro to the US dollar is 1.5, you get 1.5 US dollars if you pay 1.0 Euro (one Euro has more value than one US dollar at this exchange rate). The dataset can be found here.
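The arithmetic in the example above can be written as a minimal Python sketch. The function names here are hypothetical and chosen only for illustration; the rate of 1.5 is the example figure from the text, not a real quote.

```python
def euros_to_dollars(euros, rate):
    """Convert euros to US dollars, where `rate` is US dollars per one euro."""
    return euros * rate

def dollars_to_euros(dollars, rate):
    """Convert US dollars back to euros using the same dollars-per-euro rate."""
    return dollars / rate

rate = 1.5  # example rate from the text: 1 euro buys 1.5 US dollars
print(euros_to_dollars(1.0, rate))  # dollars received for 1 euro
print(dollars_to_euros(3.0, rate))  # euros received for 3 dollars
```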

View

Analyzing CIA Factbook Data Using SQL

In this guided project, we'll use SQL in Jupyter Notebook to analyze data from the CIA World Factbook database, which is a compendium of statistics about all of the countries on Earth. It contains demographic information such as: population, population_growth and area.

View

Investigating Fandango Movie Ratings In Python

In October 2015, data journalist Walt Hickey analyzed movie rating data (published in this article) and found a significant discrepancy between the number of stars displayed to users and the actual rating in the HTML of the page, which suggests that Fandango's rating system was biased and dishonest. According to his analysis, the actual rating was almost always rounded up to the nearest half-star, actual half-star ratings were rounded up to the nearest whole star, and on one occasion a movie rating was even rounded up by an entire star, from 4 to 5. He reported that the actual movie ratings were present in the HTML that the browser read in. His work can be found here.

Fandango was made aware of the analysis and pointed to a bug in their code. The goal of this project is to look at Fandango movie ratings for more recent movies to see if the pattern has changed.
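The rounding behaviour described in Hickey's analysis can be illustrated with a short Python sketch. The function names and the sample ratings are hypothetical; this only shows what "rounding up to the nearest half-star" means numerically and is not Fandango's code.

```python
import math

def round_up_to_half_star(rating):
    """Round a rating up to the nearest half-star (e.g. 4.1 becomes 4.5)."""
    return math.ceil(rating * 2) / 2

def round_up_to_whole_star(rating):
    """Round a rating up to the nearest whole star (e.g. 4.5 becomes 5)."""
    return math.ceil(rating)

# Hypothetical actual ratings, chosen only to illustrate the rounding.
for actual in (4.1, 4.5, 3.0):
    print(actual, round_up_to_half_star(actual), round_up_to_whole_star(actual))
```

Comparing the displayed rating with the actual one, row by row, gives the size of the inflation for each movie.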

View

Cleaning & Analyzing the Star Wars Survey In Python

Before the arrival of Star Wars: The Force Awakens in 2015, the FiveThirtyEight team used SurveyMonkey to survey 835 respondents about Star Wars in order to answer the following question: "Does the rest of America realize that “The Empire Strikes Back” is clearly the best of the bunch?" You can find the GitHub repo here.

In this project, we clean and explore the data from the Star Wars survey.

View

Zachary Waugh | Data Analyst