Zachary Waugh's Portfolio

Data Analyst skilled in SQL, Tableau, Python and Excel/Google Sheets @Zachary Waugh.

Background

My name is Zachary Waugh. I am 22 years old and interested in becoming a Data Analyst. I graduated in May 2021 with a Bachelor of Science in Mathematics, and I also have a background in Computer Science, Economics & Business. My skills include SQL, Tableau, Python, Excel/Google Sheets, and Mathematics, including Calculus 1-3, Linear Algebra, Abstract Algebra, Real Analysis, Differential Equations, Number Theory, Game Theory, Trigonometry, Mathematical Modeling and Discrete Mathematics.

Projects

Tableau Dashboards

These are dashboards that were created through the use of SQL & Excel/Google Sheets.

Contains Tableau Dashboards for Projects on Esports Earnings, Top 10 Most Profitable vs Least Profitable Movies of all time, AUDJPY FTP & TSL, AUDJPY Wins & Win Percentage Per Session, AUDJPY Confluences, Buys & Sells, Videogame Sales, COVID-19, NBA Productivity, Earthquake Magnitude & Depth, AirBnB Exploration, etc.

GitHub

Contains SQL Queries from SQL Server, Python Code from Jupyter Notebook & Datasets for every project that I have completed.

AUDJPY Exploration

Over the course of one and a half months, 130 trades were taken. For those 130 trades, total trades, wins, losses, win percentage, etc. were examined for Fixed Take Profit (FTP) & Trailing Stop Loss (TSL) trades. A deeper dive then looked at the minimum, maximum, average and total profit made from each confluence, as well as the number of times each confluence occurred. AUDJPY Dataset is here.
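As a rough illustration of the summary described above, the sketch below computes win statistics per trade type and profit statistics per confluence. The record layout (`trade_type`, `confluence`, `profit`), the sample values, and the rule that a win is any trade with positive profit are all assumptions made for this example; they are not taken from the AUDJPY dataset.

```python
from collections import defaultdict

# Hypothetical trade records; field names and values are illustrative only.
trades = [
    {"trade_type": "FTP", "confluence": "trendline", "profit": 40.0},
    {"trade_type": "FTP", "confluence": "support", "profit": -20.0},
    {"trade_type": "TSL", "confluence": "trendline", "profit": 65.0},
    {"trade_type": "TSL", "confluence": "support", "profit": -15.0},
    {"trade_type": "TSL", "confluence": "trendline", "profit": 25.0},
]


def win_stats(records):
    """Return total trades, wins, losses and win percentage per trade type."""
    counts = defaultdict(lambda: {"total": 0, "wins": 0})
    for record in records:
        entry = counts[record["trade_type"]]
        entry["total"] += 1
        if record["profit"] > 0:  # assumption: positive profit counts as a win
            entry["wins"] += 1
    return {
        trade_type: {
            "total": c["total"],
            "wins": c["wins"],
            "losses": c["total"] - c["wins"],
            "win_pct": 100.0 * c["wins"] / c["total"],
        }
        for trade_type, c in counts.items()
    }


def confluence_stats(records):
    """Return count, min, max, average and total profit per confluence."""
    profits = defaultdict(list)
    for record in records:
        profits[record["confluence"]].append(record["profit"])
    return {
        name: {
            "count": len(values),
            "min": min(values),
            "max": max(values),
            "avg": sum(values) / len(values),
            "total": sum(values),
        }
        for name, values in profits.items()
    }


print(win_stats(trades))
print(confluence_stats(trades))
```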


AirBnB Exploration

Imagine a client wants to start an AirBnB business in Seattle, Washington. They want to know where the best place to buy a home is, as well as other factors such as location, number of bedrooms and how much they can charge customers (i.e. they want to make the most profit per property). AirBnB Dataset is here.


Data Cleaning In SQL

In this project, we use SQL Server to clean messy data by removing unwanted, duplicate or unimportant data, in order to make the data and analysis easier to read and work with. Nashville Housing Dataset is here.


Movie Correlation In Python

In this project, we use Python to look at different variables and see which ones are most strongly correlated with the gross revenue of movies. Movies Dataset is here.
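One common way to check how strongly a variable moves with gross revenue is the Pearson correlation coefficient. The sketch below is a plain-Python version for illustration; the `budget` and `gross` figures are made-up values, not rows from the Movies Dataset, and the project itself may have used a library routine instead.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(var_x * var_y)


# Hypothetical figures in millions of dollars, for illustration only.
budget = [10, 50, 100, 150, 200]
gross = [30, 120, 260, 310, 520]

print(round(pearson(budget, gross), 3))
```

A value near 1 or -1 indicates a strong linear relationship, while a value near 0 indicates a weak one. Correlation alone does not show that one variable causes the other.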


Exploring eBay Car Sales Data In Python

We'll work with a dataset of used cars from eBay Kleinanzeigen, a section of the German eBay website. The aim of this project is to clean the data and analyze the included used car listings. The dataset can be found here.
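A typical cleaning step for listings like these is converting text columns to numbers and dropping implausible rows. The sketch below assumes prices are stored as strings such as `"$5,000"` and odometer readings as strings such as `"150,000km"`; those formats, the field names, the sample rows, and the price bounds are assumptions for illustration.

```python
def parse_price(text):
    """Convert a price string such as "$5,000" to an integer."""
    return int(text.replace("$", "").replace(",", ""))


def parse_odometer(text):
    """Convert an odometer string such as "150,000km" to kilometres."""
    return int(text.replace("km", "").replace(",", ""))


# Hypothetical listings; names and values are illustrative only.
listings = [
    {"name": "Example_Car_A", "price": "$5,000", "odometer": "150,000km"},
    {"name": "Example_Car_B", "price": "$0", "odometer": "125,000km"},
    {"name": "Example_Car_C", "price": "$8,500", "odometer": "70,000km"},
    {"name": "Example_Car_D", "price": "$99,999,999", "odometer": "5,000km"},
]

cleaned = [
    {
        "name": row["name"],
        "price": parse_price(row["price"]),
        "odometer_km": parse_odometer(row["odometer"]),
    }
    for row in listings
]

# Keep only plausible prices; the bounds here are arbitrary assumptions.
plausible = [row for row in cleaned if 1 <= row["price"] <= 350_000]

for row in plausible:
    print(row)
```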


Exploring Hacker News Posts In Python

Hacker News is a site started by the startup incubator Y Combinator, where user-submitted stories (known as "posts") receive votes and comments, similar to Reddit. Hacker News is extremely popular in technology and startup circles, and posts that make it to the top of the Hacker News listings can get hundreds of thousands of visitors as a result.

We're specifically interested in posts with titles that begin with either Ask HN or Show HN. Users submit Ask HN posts to ask the Hacker News community a specific question. Below are a few examples:

Ask HN: How to improve my personal website?

Ask HN: Am I the only one outraged by Twitter shutting down share counts?

Ask HN: Any recent changes to CSS that broke mobile?

Likewise, users submit Show HN posts to show the Hacker News community a project, product, or just something interesting. Below are a few examples:

Show HN: Wio Link ESP8266 Based Web of Things Hardware Development Platform

Show HN: Something pointless I made

Show HN: Shanhu.io, a programming playground powered by e8vm

We'll compare these two types of posts to determine the following:

Do Ask HN or Show HN receive more comments on average?

Do posts created at a certain time receive more comments on average?

The dataset can be found here.
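The two questions above can be sketched as a pair of grouped averages: one keyed by post type and one keyed by the hour a post was created. The field names (`title`, `num_comments`, `created_at`), the timestamp format, and the comment counts below are assumptions for illustration, and the titles are reused from the examples above.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical posts; comment counts and timestamps are illustrative only.
posts = [
    {"title": "Ask HN: How to improve my personal website?",
     "num_comments": 6, "created_at": "2016-08-16 09:55"},
    {"title": "Ask HN: Any recent changes to CSS that broke mobile?",
     "num_comments": 2, "created_at": "2016-08-16 15:10"},
    {"title": "Show HN: Something pointless I made",
     "num_comments": 1, "created_at": "2016-08-16 15:40"},
    {"title": "Show HN: Shanhu.io, a programming playground powered by e8vm",
     "num_comments": 3, "created_at": "2016-08-17 09:05"},
    {"title": "An ordinary story title",
     "num_comments": 10, "created_at": "2016-08-17 11:30"},
]


def post_type(title):
    """Classify a title as "ask", "show" or "other" by its prefix."""
    lowered = title.lower()
    if lowered.startswith("ask hn"):
        return "ask"
    if lowered.startswith("show hn"):
        return "show"
    return "other"


def average_comments(rows, key):
    """Average num_comments for each group produced by key(row)."""
    totals = defaultdict(lambda: [0, 0])  # group -> [comment sum, post count]
    for row in rows:
        bucket = totals[key(row)]
        bucket[0] += row["num_comments"]
        bucket[1] += 1
    return {group: total / count for group, (total, count) in totals.items()}


def created_hour(row):
    """Hour of day (0-23) at which the post was created."""
    return datetime.strptime(row["created_at"], "%Y-%m-%d %H:%M").hour


by_type = average_comments(posts, lambda row: post_type(row["title"]))
ask_posts = [row for row in posts if post_type(row["title"]) == "ask"]
ask_by_hour = average_comments(ask_posts, created_hour)

print(by_type)
print(ask_by_hour)
```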


Finding Heavy Traffic Indicators on I-94 In Python

In this project, I looked at two types of indicators of heavy traffic volume on I-94:

Time indicators:

The traffic is usually heavier during warm months (March–October) compared to cold months (November–February).

The traffic is usually heavier on business days compared to the weekends.

On business days, the rush hours are around 7:00 and 16:00.

Weather indicators:

Snow shower

Light rain and snow

Proximity thunderstorm with drizzle

The dataset can be found here.
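The time indicators above come from comparing average traffic volume across groups such as hour of day and business day versus weekend. The sketch below shows one way such a comparison could be computed; the timestamp format and the readings are hypothetical and are not taken from the I-94 dataset.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical (timestamp, traffic_volume) readings, for illustration only.
readings = [
    ("2016-06-06 07:00:00", 6000),  # Monday
    ("2016-06-07 07:00:00", 6200),  # Tuesday
    ("2016-06-06 16:00:00", 6400),  # Monday
    ("2016-06-06 02:00:00", 400),   # Monday
    ("2016-06-11 07:00:00", 1500),  # Saturday
    ("2016-06-11 16:00:00", 4300),  # Saturday
]


def mean_volume_by_hour(rows, business_days):
    """Average traffic volume per hour for business days or for weekends."""
    volumes = defaultdict(list)
    for stamp, volume in rows:
        moment = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        is_business_day = moment.weekday() < 5  # Monday=0 ... Friday=4
        if is_business_day == business_days:
            volumes[moment.hour].append(volume)
    return {hour: sum(v) / len(v) for hour, v in volumes.items()}


business = mean_volume_by_hour(readings, business_days=True)
weekend = mean_volume_by_hour(readings, business_days=False)

print(business)
print(weekend)
```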


Profitable App Profiles for the App Store & Google Play Markets In Python

For this project, say we're working as a data analyst for a company that builds Android and iOS mobile apps. We make our apps available on Google Play and in the App Store. We only build apps that are free to download and install, and our main source of revenue consists of in-app ads. This means that the number of users of our apps determines our revenue for any given app (i.e. the more users who see and engage with the ads, the better). Our goal for this project is to analyze data to help our developers understand what type of apps are likely to attract more users. The dataset for Android Apps from Google Play can be found here, and the dataset for iOS Apps from the App Store can be found here.
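Since only free apps matter here, one plausible workflow is to filter to free apps, then look at which genres are most common and which attract the most users on average. The sketch below illustrates that idea; the field names (`genre`, `price`, `installs`) and the sample rows are hypothetical and do not come from either dataset.

```python
from collections import Counter, defaultdict

# Hypothetical app records; names and numbers are illustrative only.
apps = [
    {"name": "App A", "genre": "Games", "price": 0.0, "installs": 1_000_000},
    {"name": "App B", "genre": "Games", "price": 0.0, "installs": 3_000_000},
    {"name": "App C", "genre": "Productivity", "price": 0.0, "installs": 500_000},
    {"name": "App D", "genre": "Games", "price": 2.99, "installs": 50_000},
    {"name": "App E", "genre": "Education", "price": 0.0, "installs": 200_000},
]

# Keep only apps that are free to download.
free_apps = [app for app in apps if app["price"] == 0.0]


def genre_share(rows):
    """Percentage of apps that fall into each genre."""
    counts = Counter(row["genre"] for row in rows)
    return {genre: 100.0 * n / len(rows) for genre, n in counts.items()}


def average_installs_by_genre(rows):
    """Average number of installs per genre."""
    installs = defaultdict(list)
    for row in rows:
        installs[row["genre"]].append(row["installs"])
    return {genre: sum(v) / len(v) for genre, v in installs.items()}


print(genre_share(free_apps))
print(average_installs_by_genre(free_apps))
```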


Analyzing NYC High School Data In Python

In this project, we examine whether the SAT is fair to high schools in New York City. The city has a significant immigrant population and is very diverse, so comparing demographic factors such as race, income, and gender with SAT scores is a good way to determine whether the SAT is a fair test. For example, if certain racial groups consistently perform better on the SAT, we would have some evidence that the SAT is unfair. The dataset can be found here.


Storytelling Data Visualization on Exchange Rates In Python

Our focus in the guided part of the project will be on the exchange rate between the Euro and the American dollar.

The dataset we'll use describes Euro daily exchange rates between 1999 and 2021. The Euro (symbolized with €) is the official currency in most of the countries of the European Union.

If the exchange rate of the Euro to the US dollar is 1.5, you get 1.5 US dollars if you pay 1.0 Euro (one Euro has more value than one US dollar at this exchange rate). The dataset can be found here.
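The arithmetic in that example can be written as two small helper functions. The function names are hypothetical, and `rate` is assumed to mean US dollars per one Euro, as in the sentence above.

```python
def euros_to_dollars(euros, rate):
    """Convert Euros to US dollars, where rate is USD per 1 EUR."""
    return euros * rate


def dollars_to_euros(dollars, rate):
    """Convert US dollars to Euros, where rate is USD per 1 EUR."""
    return dollars / rate


print(euros_to_dollars(1.0, 1.5))  # 1.0 Euro at a rate of 1.5
print(dollars_to_euros(3.0, 1.5))  # 3.0 US dollars at a rate of 1.5
```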


Analyzing CIA Factbook Data Using SQL

In this guided project, we'll use SQL in Jupyter Notebook to analyze data from the CIA World Factbook database, which is a compendium of statistics about all of the countries on Earth. It contains demographic information such as: population, population_growth and area.
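As a small illustration of the kind of query involved, the sketch below builds an in-memory SQLite table and ranks countries by population density. It uses Python's built-in `sqlite3` module rather than a notebook SQL extension, and the table name `facts`, the `name` column, and all rows are assumptions made for this example. Only `population`, `population_growth` and `area` are column names mentioned above.

```python
import sqlite3

# In-memory database with a hypothetical `facts` table and made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE facts ("
    "name TEXT, population INTEGER, population_growth REAL, area INTEGER)"
)
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?, ?)",
    [
        ("Country A", 1_000_000, 1.2, 50_000),
        ("Country B", 250_000, 0.4, 1_000),
        ("Country C", 5_000_000, 2.1, 400_000),
    ],
)

# Summary statistics for population.
summary = conn.execute(
    "SELECT MIN(population), MAX(population) FROM facts"
).fetchone()

# Population density (people per unit of area), densest first.
density = conn.execute(
    """
    SELECT name, CAST(population AS REAL) / area AS density
    FROM facts
    ORDER BY density DESC
    """
).fetchall()

print(summary)
print(density)
conn.close()
```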


Investigating Fandango Movie Ratings In Python

In October 2015, data journalist Walt Hickey analyzed movie rating data (published in this article) and found a significant discrepancy between the number of stars displayed to users and the actual rating in the HTML of the page, which suggests that Fandango's rating system was biased and dishonest. According to his analysis, the actual rating was almost always rounded up to the nearest half-star, a rating already at a half-star could be rounded up to the nearest whole star, and on one occasion a movie rating was even rounded up by an entire star, from 4 to 5. He found the actual ratings in the HTML source of the pages that the browser read in. His work can be found here.

Fandango was made aware of the analysis and pointed to a bug in their code. The goal of this project is to look at Fandango movie ratings for more recent movies to see if the pattern has changed.
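To make the rounding issue concrete, the sketch below compares conventional rounding to the nearest half-star with always rounding up to the next half-star, which is the pattern described in the analysis. The function names and the sample ratings are hypothetical; this is not Fandango's code or the code from the original article.

```python
import math


def round_to_nearest_half_star(rating):
    """Conventional rounding to the nearest 0.5 (ties round up)."""
    return math.floor(rating * 2 + 0.5) / 2


def round_up_to_half_star(rating):
    """Always round up to the next 0.5, as described in the analysis."""
    return math.ceil(rating * 2) / 2


# Hypothetical actual ratings, for illustration only.
actual_ratings = [4.1, 4.5, 3.7, 4.0]

for rating in actual_ratings:
    expected = round_to_nearest_half_star(rating)
    inflated = round_up_to_half_star(rating)
    print(rating, expected, inflated, inflated - expected)
```

A positive difference in the last column marks a rating that would be displayed higher than conventional rounding allows.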