No-code Exploratory Data Analysis: DataProfile

dataprofile.herokuapp.com

In this article, I explain how to build a pandas-profiling app with Python and Streamlit and deploy it on Heroku (dataprofile.herokuapp.com). The resulting dashboard lets the user perform exploratory data analysis without writing any code: just drag and drop your data onto the dashboard and let the magic happen!

Link to program: dataprofile.herokuapp.com

The example data used in the video can be found here.

For further details on Heroku and Pandas Profiling, I suggest skimming my previous articles: How to Deploy a Streamlit App with Heroku, and Pandas Profiling: Exploratory Data Analysis.

Getting Started — Generating app.py file

There are four libraries required to generate this app.

Pandas: required to work with tabular data.

Streamlit: the library used to generate the dashboard. See this link for more details.

Pandas-profiling: a great tool that generates an exploratory analysis report from a pandas data frame. See this link for more details.

Streamlit_pandas_profiling: used to embed the profiling report into the Streamlit dashboard.
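Before writing app.py, it can help to confirm that all four packages are installed. The snippet below is a small sketch that uses only the standard library. The helper name missing_packages is hypothetical, and the list holds the import names of the four libraries above.

```python
from importlib.util import find_spec

# Import names of the four libraries used by the app.
REQUIRED = ["pandas", "streamlit", "pandas_profiling", "streamlit_pandas_profiling"]


def missing_packages(names):
    """Return the top-level module names that cannot be found (hypothetical helper)."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Install these first:", ", ".join(missing))
    else:
        print("All required packages are available.")
```

Anything reported as missing can be installed with pip before moving on.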

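Streamlit lets us set the page layout with st.set_page_config(). I like using the wide option, st.set_page_config(layout="wide"), so that the report can use the full browser width. This is not mandatory, but if you use it, it is safest to place the call before any other Streamlit command in the script.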

Next, let’s create the function that loads the data. It uses pd.read_csv() to read the uploaded CSV file into a pandas data frame.
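A minimal sketch of such a loader is shown here. The function name load_data is an assumption. It accepts anything pd.read_csv() can read, which includes the file-like object returned by Streamlit's file uploader.

```python
import io

import pandas as pd


def load_data(file) -> pd.DataFrame:
    """Read CSV content from a path or file-like object into a data frame."""
    return pd.read_csv(file)


# Usage with an in-memory file standing in for an uploaded CSV.
sample = io.StringIO("name,score\nAda,90\nLinus,85\n")
df = load_data(sample)
print(df.shape)  # (2, 2)
```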

Let’s create a sidebar titled “Upload data”. Here a file uploader is introduced to load the data into the app.

In addition, a selection box (st.selectbox) lets the user choose which profiling mode pandas-profiling should use.
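The two widgets could be wired up roughly as follows. This is a sketch rather than the app's exact code: the function name sidebar_controls, the widget labels, and the two mode names are assumptions, and the Streamlit module is passed in as st so that the function can also be tried out with a stand-in object.

```python
# Hypothetical labels for the two profiling modes offered to the user.
PROFILING_MODES = ["Minimal Mode", "Complete Mode"]


def sidebar_controls(st):
    """Draw the upload widgets in the sidebar and return (uploaded_file, mode).

    `st` is the streamlit module (or any object exposing the same
    sidebar.header / sidebar.file_uploader / sidebar.selectbox calls).
    """
    st.sidebar.header("Upload data")
    # Returns None until the user drops a file; only CSV files are accepted.
    uploaded_file = st.sidebar.file_uploader("Upload your CSV file", type=["csv"])
    # Returns the currently selected option; the first one is the default.
    mode = st.sidebar.selectbox("Profiling mode", PROFILING_MODES)
    return uploaded_file, mode


# In app.py:
#     import streamlit as st
#     uploaded_file, mode = sidebar_controls(st)
```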

If no file has been uploaded yet, the app shows a message reminding the user to load data from the sidebar panel.

The last part of the code runs once the data is uploaded to the dashboard. First, the data is read into a data frame using pandas. Then, depending on the selected option, the ProfileReport is created and saved as pr.

Once the profile report is ready, it is rendered in the dashboard using st_profile_report(pr).
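The upload check and the report step can be sketched together as one function. Treat it as an illustration under assumptions, not the app's exact code: the names profile_options and render_report, the mode labels, and the reminder text are hypothetical. The minimal and explorative keyword arguments belong to ProfileReport, and newer releases of the profiling library ship as ydata-profiling, in which case the import line would change.

```python
import pandas as pd


def profile_options(mode: str) -> dict:
    """Translate the selected mode into ProfileReport keyword arguments."""
    # "Minimal Mode" is a hypothetical label; minimal=True turns off the
    # most expensive computations, which helps with large files.
    if mode == "Minimal Mode":
        return {"minimal": True}
    return {"explorative": True}


def render_report(st, uploaded_file, mode: str) -> None:
    """Show a reminder until a file is uploaded, then build and embed the report."""
    if uploaded_file is None:
        st.info("Upload a CSV file from the sidebar to get started.")
        return

    # Imported only when needed, because the profiling package is slow to import.
    from pandas_profiling import ProfileReport
    from streamlit_pandas_profiling import st_profile_report

    df = pd.read_csv(uploaded_file)
    pr = ProfileReport(df, **profile_options(mode))
    st_profile_report(pr)
```

In app.py this would be called as render_report(st, uploaded_file, mode) after the widgets have been created.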

The full code is available in the GitHub repository listed at the end of this article.

Deploying to Heroku

I deployed the Streamlit app to Heroku using the same method explained in my previous article (How to Deploy a Streamlit App with Heroku). The process takes about 3–4 minutes.

Conclusion

There is now an app in the cloud (dataprofile.herokuapp.com) that anyone can use to perform no-code exploratory data analysis.

Special thanks to the developers of Streamlit, pandas, and pandas-profiling.

References:

How to Deploy a Streamlit App with Heroku

Pandas Profiling: Exploratory Data Analysis

Access to the dashboard: dataprofile.herokuapp.com

Source code: https://github.com/sercangul/dataprofile

Follow me on GitHub: https://github.com/sercangul

Follow me for more information on Python, statistics, and machine learning!

Sercan Gul | Data Scientist | DataScientistTX
Nerd For Tech

Senior Data Scientist @ Pioneer | Ph.D Engineering & MS Statistics | UT Austin