Running SQL Queries

The Engineering Platform stores data using the lakehouse architecture, providing a flexible and scalable data layer. You can still run SQL queries against the lakehouse, which gives engineers, analysts, and data scientists a familiar interface.

Prerequisites

To connect to the lakehouse and run SQL queries, you'll need a SQL client such as DBeaver. Any SQL client that can connect to Apache Spark or Apache Hive will be able to connect to the lakehouse.

You'll also need to create a personal access token, which will act as your password when connecting.

We generally recommend that the lakehouse not be accessible over the public internet, meaning that you will need to be connected to a VPN or other private network. Be sure to check your network connectivity before attempting to connect your SQL client.
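
One quick way to check connectivity before configuring a client is to test whether the lakehouse's SQL port accepts TCP connections. The sketch below uses only Python's standard library. The function name `can_reach` and the timeout value are illustrative choices, and the default port is the one given in the Connecting section.

```python
import socket


def can_reach(host: str, port: int = 10000, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the name and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

Calling `can_reach("sql.analytics.nousot.com")` with your own instance's host in place of the example should return `False` if the VPN is down or the port is blocked. A `True` result only shows that the port is open, not that your credentials are valid.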

Connecting

In your SQL client of choice, create a new connection. When selecting the connection type, select either "Apache Spark" or "Apache Hive". For the host, enter "sql." followed by the domain of the instance you're connecting to. For example, if the instance domain is "analytics.nousot.com", the host for the lakehouse would be "sql.analytics.nousot.com". The port should be 10000. When prompted for a password, use your personal access token.
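
The host naming rule above can be sketched as a small helper. The function names are hypothetical. The `jdbc:hive2://` URL form is the one used by the Hive JDBC driver, and it is an assumption that your client's Spark or Hive connection type accepts it for this deployment.

```python
DEFAULT_PORT = 10000  # port given in this guide


def lakehouse_host(instance_domain: str) -> str:
    """Prefix the instance domain with "sql." to get the lakehouse host."""
    domain = instance_domain.strip().rstrip("/")
    # Drop a scheme if a full URL was pasted instead of a bare domain.
    if "://" in domain:
        domain = domain.split("://", 1)[1]
    return f"sql.{domain}"


def jdbc_url(instance_domain: str, port: int = DEFAULT_PORT) -> str:
    """Build a HiveServer2-style JDBC URL (assumed scheme: jdbc:hive2)."""
    return f"jdbc:hive2://{lakehouse_host(instance_domain)}:{port}"
```

With the example domain from this guide, `lakehouse_host("analytics.nousot.com")` gives `"sql.analytics.nousot.com"`, which is the value to enter in the client's host field.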

Querying

Once connected, the lakehouse behaves like a typical SQL database: your SQL client will show you any databases and tables that you have access to. The lakehouse uses Spark SQL, so functions from that dialect are available.
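
If you prefer to query from a script rather than a graphical client, a minimal sketch follows. It works with any Python DB-API cursor. The helper name `fetch_rows` is hypothetical, and the commented PyHive usage is an assumption: this guide does not name a Python driver, and the correct `auth` mode depends on how your deployment is configured.

```python
from typing import Any


def fetch_rows(cursor: Any, sql: str) -> list[dict[str, Any]]:
    """Run one statement on a DB-API cursor and return rows as dicts."""
    cursor.execute(sql)
    # cursor.description has one entry per column; item 0 is the column name.
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


# Spark SQL statements you might try first after connecting.
EXAMPLE_QUERIES = [
    "SHOW DATABASES",
    "SHOW TABLES IN default",
    "SELECT current_date() AS today",
]

# Hypothetical usage with PyHive (an assumption, not part of this guide):
# from pyhive import hive
# conn = hive.connect(host="sql.analytics.nousot.com", port=10000,
#                     username="<username>",
#                     password="<personal-access-token>",
#                     auth="LDAP")  # auth mode is deployment-dependent
# for sql in EXAMPLE_QUERIES:
#     print(fetch_rows(conn.cursor(), sql))
```

Returning rows as dictionaries keyed by column name keeps the results readable regardless of which driver supplies the cursor.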