🤓 4 best books for data analysts

Read Time: 4 minutes

Hey crunchers! The Query here — the data analyst newsletter that's like a pivot table for your data career, summarizing and analyzing data to reveal hidden trends and insights.

Here’s what we have for you today:

  • The best books for data analysts 📖 

  • What it actually means for a business to be “data-driven”

  • A way to “QUALIFY” your SQL statements

  • Joy in the form of memes 🤣 

This post contains affiliate links. We may earn a commission for purchases made at no additional cost to you.

def learn_data_analysis(👨‍💻):

1. The BEST books to supplement your learning as a data analyst 📚️ 

When I’m learning something new, I like using books as supplementary learning resources.

I usually only pick them up in the evenings, right before bed.

When picking up a new skill like data analysis, 80% of your time should be “learning by doing” (i.e. building projects).

But when the day is over and you’re winding down, picking up a related book can help keep your brain focused on data and speed up the learning process.

Here are my 4 personal favorite books related to data analytics:

2. What it means to be a “data-driven” organization

What does it mean for an organization to be data-driven?

If you’re someone who wants to work in data, this is something you MUST understand.

Here’s a series of 3 posts by George Xing that does an excellent job of explaining the nuances of this:

select * from dataset-of-the-week

This week’s dataset is an awesome portfolio project dataset.

It’s HUGE.

The UNESCO Institute for Statistics collects country-level data on the number of teachers, teacher-to-student ratios, and related figures.

Here’s what I recommend: Take 20 minutes and explore the various datasets available using the data explorer. Come up with a few interesting questions the data could answer. Then answer them with whatever analytics workflow you’re comfortable with.

If it were me, I would:

  • Import the raw data to BigQuery

  • Use SQL to transform the data into the format I need for analysis

  • Complete the analysis with Tableau Public or Looker Studio (both free)

class MiniLesson:

QUALIFY Clause in SQL

The QUALIFY clause is a handy tool for making your queries more concise and readable. It isn't part of standard SQL, but several dialects support it, including BigQuery and Snowflake.

I use it all the time when I’m writing SQL in BigQuery.

It allows you to filter the results of a query based on the result of a window function, such as ROW_NUMBER().

This can be particularly useful when you want to select a single row for each group based on specific criteria, without the need to write a separate Common Table Expression (CTE) or subquery.

Let's consider a sales dataset with the following columns:

order_id, customer_id, product_id, sale_date, and sale_amount

Suppose you want to find the most recent purchase for each customer.

You can use QUALIFY along with ROW_NUMBER() to achieve this without using a CTE or subquery.

Here's an example of how to do this with and without QUALIFY, so you can see the benefit:
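A minimal sketch of the two versions, assuming the columns above live in a table called `sales` (the table name, the `ranked_sales` CTE name, and the `row_num` alias are made up for illustration):

```sql
-- Assumption: the sales columns live in a table named `sales` (hypothetical name).

-- Without QUALIFY: compute the row number in a CTE, then filter in an outer query.
WITH ranked_sales AS (
  SELECT
    order_id,
    customer_id,
    product_id,
    sale_date,
    sale_amount,
    ROW_NUMBER() OVER (
      PARTITION BY customer_id   -- restart numbering for each customer
      ORDER BY sale_date DESC    -- newest sale gets row number 1
    ) AS row_num
  FROM sales
)
SELECT
  order_id,
  customer_id,
  product_id,
  sale_date,
  sale_amount
FROM ranked_sales
WHERE row_num = 1;

-- With QUALIFY: same result, filtering on the window function directly.
SELECT
  order_id,
  customer_id,
  product_id,
  sale_date,
  sale_amount
FROM sales
QUALIFY ROW_NUMBER() OVER (
  PARTITION BY customer_id
  ORDER BY sale_date DESC
) = 1;
```

One thing to watch: if a customer can have several orders on the same sale_date, ROW_NUMBER() picks one of them arbitrarily. Adding a tiebreaker such as order_id to the ORDER BY makes the result deterministic.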

In this query, the ROW_NUMBER() window function numbers the rows within each customer_id partition, ordered by sale_date from newest to oldest, so row number 1 goes to each customer's most recent purchase.

The QUALIFY clause then filters the results to only include rows with a row number of 1, which returns the most recent purchase for each customer.

The benefits of using QUALIFY with ROW_NUMBER() include:

  • Simplifying your query: Using QUALIFY can help you avoid writing complex subqueries or CTEs, making your query easier to understand and maintain.

  • Possibly improving performance: Because QUALIFY filters on the window function result in the same query block, less data may flow into later steps. Many query engines plan it the same way as the subquery version, though, so check the execution details before counting on a speedup.

  • Enhancing readability: By removing the need for nested subqueries or CTEs, QUALIFY can make your query more readable, allowing you and your colleagues to understand it more easily.

The QUALIFY clause can be a powerful tool for data analysts, simplifying queries and improving readability, and sometimes performance.

Remember QUALIFY next time you are working with window functions!

import memes as 😂 

Me Singing to ChatGPT: “You raise me upppp…”

That’s it for today.

Stay crunchin’ folks and see you next week!
