SQL cell
Last updated
Last updated
SQL is a first-class citizen in DataLab, with a powerful query engine to query databases, data warehouses, data frames and CSV files. With autocompletion, a full-fledged schema browser and SQL interpolation, DataLab brings all the SQL features you need to power your analyses.
Through the ability to query DataFrames and CSV files with SQL code, you can easily switch from SQL to Python/R and back. Through the 'Query mode' you can break up SQL queries riddled with common table expressions (CTEs) into chains of SQL cells, which makes things easier to debug and reuse.
Click "Add SQL cell" at the bottom of a notebook file (or in between two cells).
In the first dropdown in the header of the SQL cell, select which data source you want to query:
To run a query against a remote data warehouse or database, select it from the list. If you have a SQL database that you want to connect to, see Connect your data to DataLab If you don't have a SQL database to connect to, you can select one of the sample databases.
To run a query against a local DataFrame or CSV file inside your workbook, select "DataFrames and CSVs".
In the second dropdown in the SQL cell, decide how the SQL cell should run:
DataFrame mode: the full result of the SQL query will be stored as a DataFrame with a name you can specify.
Query mode: only a preview of the SQL query will show. The query is saved with a name that you can refer to in subsequent SQL queries.
(Optional) In the input field behind 'available as', specify the name of the data frame or the query (depending on your choice in the previous step).
Write the SQL query.
Click "Run" in the cell menu (or use the shortcuts). Depending on the mode you selected, you'll have a DataFrame to continue your analysis or a query that you can reference in a next SQL cell.
The different types of data sources to query and the different modes of querying enable powerful workflows that are useful in different combinations. Learn more about different ways of using the SQL cell in SQL scenarios.
Finally, there are times you want to dynamically update your SQL queries based on results of previous calculations or other data in your notebook. If this rings a bell, learn how canParameterize your SQL query.