🤓 Data Analysts: The PERFECT dataset for your next portfolio project
Read Time: 3 minutes
GM crunchers! The Query here — the data newsletter that's like a neural net for your data career. We help you establish neural connections so you learn data analysis faster.
Here’s what we have for you today:
Learn Self-Joins in 10 minutes
Why ACS and Census data is perfect for portfolio projects 💻️
A cool function for doing Time Series analysis in SQL
Funnies, haha 😆
def learn_data_analysis(👨‍💻):
1. Self Joins: Take Your SQL to the Next Level
When I first saw a self-join, I was probably 6 months into learning data analysis.
My mind was blown 🤯
A self-join is exactly what it sounds like — joining a table onto itself.
When would you want to do this?
As with all things SQL, it’s best to see an example.
In this article by Haley Hamer, there are two examples of self-joins.
Spend a few minutes reading through them.
You won’t regret adding this to your arsenal of SQL tricks.
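If you want to try one right away, here is a minimal sketch of a self-join. It is not taken from the linked article: the employees table, its columns, and the names in it are all made up for illustration, and it runs on Python's built-in sqlite3 module so there is nothing to install.

```python
import sqlite3

# Hypothetical data: each employee row stores the id of their manager,
# who is just another row in the same table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (
        id INTEGER PRIMARY KEY,
        name TEXT,
        manager_id INTEGER
    );
    INSERT INTO employees VALUES
        (1, 'Ana', NULL),
        (2, 'Ben', 1),
        (3, 'Cal', 1),
        (4, 'Dee', 2);
""")

# The self-join: the same table appears twice under two aliases.
# "e" plays the role of the employee, "m" plays the role of the manager.
SELF_JOIN_SQL = """
    SELECT e.name AS employee, m.name AS manager
    FROM employees AS e
    LEFT JOIN employees AS m ON e.manager_id = m.id
    ORDER BY e.id;
"""

employee_managers = conn.execute(SELF_JOIN_SQL).fetchall()
for employee, manager in employee_managers:
    print(employee, "reports to", manager)
```

The LEFT JOIN keeps Ana in the output even though she has no manager. With a plain inner join, her row would drop out.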
2. Ever wonder what goes on inside the mind of a Data Analyst hiring manager?
You’re in luck if you have.
Trenton Huey is the Director of Data at Vida Health.
In this interview, he answers questions like:
What skills/qualities do you look for when interviewing candidates for a data analyst position?
What are common data analysis projects?
If you had to start your career over, what would you do differently?
Enjoy!
select * from dataset-of-the-week
If you struggle with deciding which datasets to use for portfolio projects, you’re not alone.
This is a common problem.
Here’s the solution: just choose American Community Survey (ACS) and Census data.
This data is used by many different companies to do things like:
Size total addressable markets
Determine the best geos to launch new products in
Tailor marketing messaging to various demos
And so much more…
The data is publicly available, and if you get familiar with it while you’re learning data analysis, that knowledge will bear fruit over the course of your career.
And some advice…
Don’t expect it to be easy to find the data you’re looking for (there are 1000s of different datasets). It’s not. But you’ll eventually figure it out if you put in the time.
Don’t expect the data to be perfect. This is good. It will give you real-world experience cleaning data and finding the nuances of a dataset.
class MiniLesson:
Time Series Analysis with LAG()
Time series analysis is one of the most important skills you can have as a data analyst.
This is because many of the most important insights come from how a metric changes over time.
The LAG() function in SQL is a window function that is helpful for this.
Using the LAG() function, you can calculate period-to-period changes, such as the difference in sales between consecutive months or the growth in user signups from week to week.
It can seem a bit confusing at first since it’s a window function, but I promise you’ll get the hang of it.
Here’s how to use it to calculate the month-over-month change in a sales column.
Let's consider an example where we have a table named monthly_sales that holds monthly sales data, with a month column and a sales column.
In this example, the LAG() function is used together with an OVER() clause, which defines the order of rows to be used in the calculation.
This retrieves the sales value from the previous row, relative to the current row, when sorted by the month column (previous_month_sales).
Then, we subtract the previous month's sales from the current month's sales to get the sales change.
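The steps above can be sketched as a runnable query. The table and column names (monthly_sales, month, sales, previous_month_sales) come from the walkthrough, but the three rows of data are made-up illustrative values. It uses Python's built-in sqlite3 module and assumes your SQLite version is 3.25 or newer, which is when window functions were added.

```python
import sqlite3

# Illustrative values only: three months of hypothetical sales.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE monthly_sales (month TEXT PRIMARY KEY, sales INTEGER);
    INSERT INTO monthly_sales VALUES
        ('2023-01', 1000),
        ('2023-02', 1200),
        ('2023-03', 900);
""")

# LAG(sales) looks one row back, where "back" is defined by
# the ORDER BY inside the OVER() clause.
LAG_SQL = """
    SELECT
        month,
        sales,
        LAG(sales) OVER (ORDER BY month) AS previous_month_sales,
        sales - LAG(sales) OVER (ORDER BY month) AS sales_change
    FROM monthly_sales
    ORDER BY month;
"""

sales_changes = conn.execute(LAG_SQL).fetchall()
for row in sales_changes:
    print(row)
```

The first month has no previous row, so previous_month_sales and sales_change both come back as NULL (None in Python) for it.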
Understanding how to use the LAG() function in SQL is essential, as it allows you to perform period-to-period comparisons and calculations within your queries.
By using the LAG() function, you can gain insights into how values change over time, helping you identify trends, growth rates, and patterns in your data.
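The same idea extends to growth rates. Below is a rough sketch for week-over-week signup growth, assuming a hypothetical weekly_signups table with made-up numbers. It again uses Python's sqlite3 module (SQLite 3.25 or newer) and guards against dividing by zero with NULLIF.

```python
import sqlite3

# Hypothetical table and values, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE weekly_signups (week TEXT PRIMARY KEY, signups INTEGER);
    INSERT INTO weekly_signups VALUES
        ('2023-W01', 200),
        ('2023-W02', 250),
        ('2023-W03', 200);
""")

# Step 1 (the CTE): attach the previous week's signups to each row.
# Step 2: turn the difference into a percentage of the previous week.
GROWTH_SQL = """
    WITH with_previous AS (
        SELECT
            week,
            signups,
            LAG(signups) OVER (ORDER BY week) AS previous_signups
        FROM weekly_signups
    )
    SELECT
        week,
        ROUND(
            100.0 * (signups - previous_signups)
                / NULLIF(previous_signups, 0),
            1
        ) AS growth_pct
    FROM with_previous
    ORDER BY week;
"""

signup_growth = conn.execute(GROWTH_SQL).fetchall()
for week, growth_pct in signup_growth:
    print(week, growth_pct)
```

Multiplying by 100.0 rather than 100 forces decimal division, so the percentages are not truncated to whole numbers.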
import memes as 😂
That’s it for today.
Stay crunchin’ folks and see you next week!