DataLab Docs

Addressing slow code

DataLab runs on performant servers in the cloud. If you are on DataLab Starter, the free tier of DataLab, your workbook gets 0.5 vCPUs and 4 GB of RAM. If you are on DataLab Premium, your workbook gets 2 vCPUs and 16 GB of RAM. For more information on the differences between the plans, check Pricing.

If the work you're doing is resource intensive, your code may be slow to execute. This article lists the most common causes of slow code, with suggestions on how to address each of them.

Plots are slow to generate

  • Try to reduce the amount of data you are plotting, for example, by aggregating over a certain dimension or just taking a sample of the data.

  • Some plot types are notoriously resource intensive to generate, for example, swarm plots. Consider another plot type.
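As a rough illustration of the first suggestion, the sketch below reduces a dataset before it reaches any plotting call, once by random sampling and once by aggregating over a dimension. The data and the helper names `sample_rows` and `mean_by_key` are hypothetical, and only the Python standard library is used; in a real workbook you would more likely do the same with your data frame library of choice.

```python
import random
from collections import defaultdict
from statistics import mean


def sample_rows(rows, max_rows, seed=0):
    """Return at most max_rows rows, chosen uniformly at random."""
    if len(rows) <= max_rows:
        return list(rows)
    # A fixed seed keeps the sample (and the plot) reproducible between runs.
    return random.Random(seed).sample(rows, max_rows)


def mean_by_key(rows):
    """Aggregate (key, value) pairs into one mean value per key."""
    groups = defaultdict(list)
    for key, value in rows:
        groups[key].append(value)
    return {key: mean(values) for key, values in groups.items()}


# Hypothetical data: 100,000 (weekday, measurement) pairs.
rows = [(i % 7, float(i)) for i in range(100_000)]

sampled = sample_rows(rows, max_rows=1_000)  # 1,000 points instead of 100,000
per_weekday = mean_by_key(rows)              # 7 points, one per weekday
```

Either `sampled` or `per_weekday` can then be passed to the plotting library, which has far fewer points to draw.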

My cell keeps running forever

  • Your cell might contain an infinite loop. Check for while loops whose condition never becomes false.
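One defensive pattern, sketched here under the assumption that you can put an upper bound on how many iterations a loop should ever need, is to add an iteration cap so that a condition that never becomes false raises an error instead of running forever. The function `halve_until_small` and its parameters are made up for illustration.

```python
def halve_until_small(value, threshold=1.0, max_iterations=1_000):
    """Halve value until it is no longer above threshold.

    Raises RuntimeError if the loop condition is still true after
    max_iterations passes, which usually points to a bug in the condition.
    """
    iterations = 0
    while value > threshold:
        if iterations >= max_iterations:
            raise RuntimeError(
                f"Loop did not finish after {max_iterations} iterations"
            )
        value /= 2
        iterations += 1
    return value, iterations


result = halve_until_small(64.0)  # terminates normally

# With a negative threshold, a positive value never drops below it,
# so the condition stays true and the cap stops the loop.
try:
    halve_until_small(64.0, threshold=-1.0)
    hit_cap = False
except RuntimeError:
    hit_cap = True
```

Without the cap, the second call would keep the cell running indefinitely.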

Database queries are slow

  • The database you are querying might be slow to respond. In that case, consider restarting the database or making your database server more powerful.

  • You may be querying a lot of rows, resulting in a large data transfer to your notebook. In that case, you can try a couple of things:

    • Make your query more specific so you fetch only the data you need.

    • Limit the number of rows returned by your query until you're certain the rows contain the data you need, and only then perform a query without a limit.

    • If you want to aggregate the result of the query, consider doing the aggregation in SQL rather than in Python, so that most of the computation happens on the database and only the aggregated rows are transferred.
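The suggestions above can be sketched with a small in-memory SQLite database, which stands in here for whatever database you actually query. The `orders` table, its columns, and its rows are hypothetical.

```python
import sqlite3

# In-memory SQLite is only a stand-in for your real database connection.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        ("north", 10.0),
        ("south", 20.0),
        ("north", 30.0),
        ("south", 40.0),
        ("east", 5.0),
    ],
)

# Be specific and limit the rows while you are still exploring the data.
preview = conn.execute(
    "SELECT region, amount FROM orders WHERE amount >= 10 LIMIT 3"
).fetchall()

# Aggregate in SQL: one row per region comes back instead of every order.
totals = conn.execute(
    "SELECT region, SUM(amount) AS total "
    "FROM orders GROUP BY region ORDER BY region"
).fetchall()

conn.close()
```

Here `preview` holds at most three rows, and `totals` holds one row per region regardless of how many orders the table contains.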

Machine learning models are slow to train

  • Some machine learning models benefit from GPUs when training. DataLab does not provide GPU machines yet. As an alternative, you can train the model on your own computer and then upload the trained weights to your workbook.

  • If you're on DataLab Starter and your workload can be parallelized, consider upgrading to DataLab Premium to use more vCPUs for the training.
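For the train-elsewhere approach, a minimal sketch of the round trip might look like the following. It assumes a toy linear model whose weights fit in a plain dictionary and uses JSON as the file format; real frameworks ship their own save and load functions, which you should prefer. The file name and the helpers `save_weights`, `load_weights`, and `predict` are hypothetical.

```python
import json
import os
import tempfile


def save_weights(weights, path):
    """Write model weights to a JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(weights, f)


def load_weights(path):
    """Read model weights back from a JSON file."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def predict(weights, features):
    """Toy linear model: bias plus the weighted sum of the features."""
    weighted = sum(c * x for c, x in zip(weights["coefficients"], features))
    return weights["bias"] + weighted


# On your own computer: train (omitted here) and save the result.
trained = {"coefficients": [0.5, -1.0], "bias": 2.0}
path = os.path.join(tempfile.mkdtemp(), "model_weights.json")
save_weights(trained, path)

# In the workbook, after uploading the file: load the weights and predict.
restored = load_weights(path)
prediction = predict(restored, [4.0, 1.0])
```

Only inference then runs in the workbook, which is typically much cheaper than training.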


Last updated 11 months ago
