Accessing and storing your data from numerous sources is vital whether you are working as a web developer, data analyst, or data scientist. You can use any program such as Python, R and Tableau when accessing your Strata Scratch database. For this tutorial, I will show you the simple steps in connecting to your Strata Scratch database using the Python language.
In this guide, we will be using the Jupyter notebook for running our Python codes. Before you begin accessing your Strata Scratch database, make sure you have installed the Jupyter notebook as well as the package for PostgreSQL,
psycopg2 (refer to this guide for instructions on how to install
psycopg2 on Jupyter notebooks. There are many ways to install the Jupyter notebook, and one of the easiest ways is to download and install Anaconda. This is the most common Python distribution which already includes the necessary Python packages. Make sure you download the latest version compatible to your computer OS.
If you have Anaconda/Jupyter Notebooks and psycopg2 installed, you can get started immediately by downloading our template.
Otherwise, you can create a notebook using the instructions below.
Throughout this tutorial, you are expected to have a background in SQL and Python programming. Otherwise, I would recommend you to review these languages before you proceed on the steps below.
The Jupyter notebook will be automatically launched to your default browser. Here, you will see a dashboard containing a list of notebook folders.
Untitledlabel on top of the menu bar.
mypython. Then click Rename to save the changes.
Now you have created a Python notebook, let’s start by importing the packages necessary to connect to our Strata Scratch database.
psycopg2modules to your file. The module
pandasis a Python package that offers easy-to-use analysis tools and data structures for manipulating your tables, including
DataFramewhich we will be using in this tutorial.
On the other hand, we also need to import the
psycopg2 module so we can interact with our Strata Scratch database with Python.
Take note that we used the keyword
as to create the new names
ps which basically refers to the same object,
host_name is the local host, which is in our example stratascratch.com. The
dbname is the name of your database.
pwd are the username and password you used for your database. You can find your username and password on your Strata Scratch profile. Simply log-in to your account and click profile:
5432 is generally the default port, unless it is specified.
In the statement above, we are trying to connect to the database using
ps.connect with the parameters we have previously encoded. Once you are able to open to the database, this block of code will return a connection object. If the connection is successful, the output will display the ‘Connected’ message; otherwise, you will see an error message. This condition is defined within the
try … else statement.
The first line of code creates a cursor object which will allow you to execute the queries.
The next statement executes the SQL queries. In this example, we are simply retrieving the
airbnb_searches table under the schema
strata_user. We added the statement
LIMIT 50 because we are only interested in retrieving the first 50 of the data.
The third line calls the
cur.fetchall to fetch all the rows of data from the database table.
Once you have executed the query, the
conn.commit method is called to commit the changes in the database.
And the last statement calls the
DataFrame method to tabulate the results into rows and columns.
dfobject after the
You should see the following output:
Lastly, don’t forget to add
cur.close() on the last statement. This command will tell the server to automatically close the cursor after all the results have been fetched or the cursor has been idle for a certain period.