7 ways to level up your SQL queries

As data professionals, we spend countless hours writing, debugging, and optimizing SQL queries. Whether you're crafting dashboards, building pipelines, or simply digging into data, your SQL code can often mean the difference between an insightful analysis and an unreadable mess.

But here's the good news: writing cleaner, faster, and more maintainable SQL doesn’t require any black magic. In this post, I’ll share seven tips to elevate your SQL skills and help you write queries that are not only efficient but also easy to revisit without giving you a headache.

Use CTEs instead of subqueries

One common habit I see among people new to SQL is relying heavily on subqueries. While subqueries get the job done, they often make the code difficult to read - especially when nested subqueries are involved.

A much cleaner alternative is using CTEs (Common Table Expressions). Defined before the main SELECT statement, CTEs can simplify your queries by breaking them into logical, reusable chunks. They can also be referenced by other CTEs or the final SELECT statement, making your code both easier to read and maintain.

SQL query using subquery

SQL query is using subquery in the join

SQL query using CTE

SQL query joins on CTE

Use the QUALIFY clause

If you want to deduplicate your dataset or filter based on window functions for other purposes, you would typically create a subquery or, preferably, a CTE. In this approach, you define the window function in one CTE and then filter on it in the next CTE or the final SELECT statement.

The QUALIFY clause simplifies this process significantly. It allows you to filter directly on the results of a window function within the same SELECT statement, much like you would use a WHERE clause.

SQL query doing ranking in CTE

SQL query that uses CTE to calculate ranking

SQL query using qualify

SQL query using QUALIFY

Use COALESCE instead of CASE

The CASE statement is a powerful tool in SQL, but it's often overused in scenarios where simpler alternatives exist. One common example is handling NULL values, where overly complex CASE expressions are frequently written unnecessarily.

For NULL checks, the COALESCE function is a much cleaner and more efficient solution. It returns the first non-NULL value from a list, making your code both simpler and easier to read.

SQL query using CASE for null check

SQL query using CASE for null checks

SQL query using COALESCE for null checks

SQL query using COALESCE

Use DECODE instead of CASE or REPLACE

In SQL programming, there are often situations where you need to map certain values in a column and replace them with others. This frequently results in either massive CASE statements or the clumsy REPLACE(REPLACE(REPLACE(...)) pattern.

To avoid this repetitive syntax, you can use the DECODE function instead. It provides a cleaner and more concise way to transform column values using conditional logic.

Disclaimer: In many cases, it’s better to store the mapping in a dedicated mapping table rather than embedding the mapping logic directly in your code.

SQL query using REPLACE for mapping

SQL query using DECODE for mapping

SQL query using DECODE for mapping

Avoid OR in WHERE clauses

One thing I often find confusing during code reviews is the use of OR statements in WHERE clauses, especially when they’re mixed with or followed by other conditions. These can quickly make the logic harder to read and understand. In many cases, you can simplify such queries by using the IN statement instead.

SQL query using OR in WHERE clause

SQL query using IN in WHERE clause

Use a final CTE

This tip is particularly useful for production code, such as data pipelines or dbt models, where you’re creating tables that will persist in a database or data warehouse.

In these scenarios, it’s helpful to include a final CTE that explicitly defines all columns with the correct names and data types. This approach provides clarity about the expected columns and their formats, even in queries spanning hundreds of lines. It can save you significant time and frustration when revisiting the SQL query later.

SQL query using a final CTE

SQL query using a final CTE

Use a linter to standardize your formatting

Everyone has their own style of writing SQL—even your own code can look drastically different depending on when you wrote it. This inconsistency can make code reviews or revisiting your own work unnecessarily confusing.

That’s why using a SQL linter like sqlfluff is a great idea. With sqlfluff, you can not only lint your code but also automatically fix and standardize formatting, ensuring a consistent and readable style across your projects.

And the good news: It is open-source and very easy to setup using pip.

Wrapping things up

Writing clean, efficient, and maintainable SQL is an essential skill for any data professional. By incorporating these seven tips into your workflow you can drastically improve the readability and performance of your queries.

Remember, the goal isn’t just to write queries that work but to create code that you and others can easily understand and maintain. Whether you’re building pipelines, crafting reports, or tackling massive datasets, these practices will save you time, reduce frustration, and elevate the quality of your SQL.

Zurück
Zurück

Das böse Wort Self-Service - und wie es dennoch funktionieren kann