SQL Pool in Azure

Kumar Preeti Lata
5 min readOct 3, 2024

Why did the data engineer bring a ladder to work?
Because they wanted to work on a “high-level” query!

Okay, now that we’ve got that out of the way, let’s roll up our sleeves and get into the magic of SQL pools in Azure Synapse. Whether you’re here to climb new data heights or just need a hand navigating the confusing world of SQL pools, you’re in the right place!

What’s a SQL Pool Anyway?

Think of a SQL pool like a very sophisticated database engine — on steroids. But not the questionable type, the kind that actually helps you tackle large-scale data queries like a boss. In the world of Azure Synapse Analytics, SQL pools come in two flavors: Serverless and Dedicated. Each has its strengths, depending on your needs, your budget, and (let’s be real) your patience for optimization.

Dedicated SQL Pools: Your Workhorse

A dedicated SQL pool is like a weightlifting champ who can haul massive amounts of data. It’s always ready to lift (i.e., process queries), and you’ve got full control over its power. You provision it, scale it up, scale it down — just like upgrading from a mini-fridge to a walk-in freezer when your data grows.

How Dedicated SQL Pools Work

In a dedicated SQL pool, compute resources are always on stand-by for you. You define how much computing power you want ahead of time, and it stays reserved for you, making query execution fast and efficient. This is your go-to if you have predictable workloads, or if your datasets are too beefy to play nice with a serverless solution.

  • Storage and Compute Separation: In a dedicated pool, storage and compute are separated, so you can keep data stored without paying for compute when it’s not in use.
  • Data Distribution: Data is distributed across several nodes in your dedicated pool. This enables you to perform parallel processing on those monstrous queries that would otherwise slow down a standard database.

Fun fact: Dedicated SQL pools were formerly called “SQL Data Warehouses,” so if you’re ever in a data engineering time machine, you know what to call them!

Serverless SQL Pools: The Pay-as-You-Go Hero

If dedicated SQL pools are your trusty workhorse, then serverless SQL pools are more like that gig worker who shows up only when you need them. No long-term commitment — just pay for the queries you run.

How Serverless SQL Pools Work

With serverless SQL pools, you don’t provision anything ahead of time. When you run a query, Azure allocates the necessary resources, and you’re charged based on the amount of data processed by that query. It’s super cost-effective for ad-hoc workloads or exploring your data without needing to commit to infrastructure.

  • No Resource Provisioning: Unlike dedicated pools, there’s no infrastructure to manage. Just submit a query, and let Azure handle the heavy lifting.
  • Cost Efficiency: You only pay for the data scanned by your queries, making it a solid choice for less frequent or unpredictable workloads.

Imagine it as having access to a virtual SQL ninja who appears when summoned. You never see them, but when you ask for a query to be sliced and diced, they handle it with precision, disappearing back into the cloud once the job is done.

Joke break!
Why did the DBA switch to serverless?
Because they wanted their queries to be as light as their infrastructure bills!

External Tables: Let’s Talk Shop

Okay, so now that you know about serverless and dedicated pools, let’s chat about External Tables. They’re kind of like inviting guests over to your house, but instead of making them move in, they get to stay at their place, and you still have access to everything they bring.

What Are External Tables?

External tables allow you to query data that’s sitting outside of your SQL pool, like in Azure Data Lake, without needing to load it into the pool. It’s like keeping your data in its original location but making it available for querying as if it’s part of your SQL environment.

Think of it like a neighborhood BBQ. You don’t need everyone to bring their food into your house to enjoy it — you can just grill right outside, and everyone’s still having a good time. Similarly, external tables let you work with your data without importing it into your database.

How External Tables Work in Serverless Pools

One of the beauties of serverless SQL pools is how easily they integrate with external tables. You can query your data directly from data lakes, whether it’s stored in Parquet, CSV, or JSON format.

For example, if you have logs sitting in Azure Blob Storage, you can use a serverless pool to query that data using T-SQL, and bam! You’ve got your results, no data ingestion required.

Pro tip: Always double-check your external data sources and formats when using external tables. The last thing you want is to discover halfway through a query that your CSV file has “comma-tastrophes” where fields are randomly split.

When to Use What?

Now, here comes the million-dollar question: When should you use a serverless pool, a dedicated pool, or external tables? Don’t worry, I’ve got your back:

  • Use Dedicated SQL Pools when you have large, predictable workloads and need fast query performance on structured data. These pools are great for managing OLAP (Online Analytical Processing) workloads, where performance and high availability are critical.
  • Use Serverless SQL Pools for on-demand, ad-hoc querying, and when you don’t want to worry about infrastructure. Serverless pools are also fantastic for integrating with external tables to query data in Azure Data Lake without ingesting it.
  • Use External Tables when you need access to large datasets that are too costly or unnecessary to move into your SQL pool. External tables let you treat outside data sources as if they’re part of your database, without the overhead of importing them.

Scaling Your SQL Pools

One of the best things about SQL pools in Azure Synapse is that scaling is as easy as sliding a bar. With dedicated pools, you can scale up or down based on your workload’s needs, ensuring that your queries run faster when needed and saving money during off-peak times.

Meanwhile, serverless pools are auto-scaling by nature. Azure will dynamically allocate the right amount of compute resources based on the complexity of your queries.

Wrapping Up with a Query

SQL pools, whether serverless or dedicated, give you the flexibility to manage your data workloads with ease. Whether you’re running batch jobs in a dedicated pool or querying external data on the fly in a serverless pool, Azure Synapse Analytics has got you covered.

Last joke!
Why don’t data engineers tell jokes in SQL?
Because they’d never join properly! (ba dum tss)

And that’s a wrap! Now, go forth and conquer those data pipelines with the right SQL pool at your side!

--

--

Kumar Preeti Lata
Kumar Preeti Lata

Written by Kumar Preeti Lata

Seasoned Data Professional with an appetite to learn more and know more. I write about everything I find interesting under the sky !! #rinfinityplus

No responses yet