Generating data can be important, but manually collecting data that meets our needs takes time. For that reason, we can synthesize the data programmatically instead. This article outlines my top 3 Python packages for generating synthetic data. The generated data can be used for any data project you want. Let’s get into it.
Category: Python
Python – Object-Oriented Programming
Python, like every other object-oriented language, allows you to define classes to create objects. Python’s built-in classes are its most common data types, such as strings, lists, dictionaries, and so on.
A class is a collection of instance variables and related methods that define a particular object type. You can think of a class as an object’s blueprint or template. Attributes are the names given to the variables that make up a class.
10 Useful Tools for Python Developers
Whether you need Python tools for data science, machine learning, web development, or anything in between, this list has you covered.
How to Create a Python Dash Plotly Dashboard App
In this tutorial, I will walk through a practical example of how to create a Python Dash Plotly app. I will create multiple charts for data visualization using dynamic callbacks, which Plotly calls Pattern-Matching Callbacks. I will use world population data to build the dashboard app.
Introduction:
Pattern-Matching Callbacks let us create different charts for data visualization with callbacks and give users much more power and control over the app. They make it possible to write callbacks for sets of inputs and outputs that don’t yet exist in the app.
MATCH will fire the callback when any of the component’s properties change. However, instead of passing all of the values into the callback, MATCH will pass just a single value into the callback. Instead of updating a single output, it will update the dynamic output that it is “matched” with.
Install / Import Python necessary Libraries:
Let’s get started. I’m using Anaconda Jupyter Notebook. Launch the command prompt and install the following libraries if you don’t already have them on your computer, then import them as listed below.
import dash  # pip install dash
from dash import dcc
from dash import html
from dash.dependencies import Input, Output, ALL, State, MATCH, ALLSMALLER
import plotly.express as px  # pip install plotly==5.2.2
import pandas as pd  # pip install pandas
import numpy as np  # pip install numpy
Get Data:
We then read the CSV file into a pandas data frame. I have downloaded the file to my computer, but you can get it from my GitHub repository link.
df = pd.read_csv("Documents/Data Science/population.csv") #https://github.com/Valnjee/datascience/blob/master/population.csv
print(df)
            country    year    population
0             China  2020.0  1.439324e+09
1             China  2019.0  1.433784e+09
2             China  2018.0  1.427648e+09
3             China  2017.0  1.421022e+09
4             China  2016.0  1.414049e+09
...             ...     ...           ...
4180  United States  1965.0  1.997337e+08
4181  United States  1960.0  1.867206e+08
4182  United States  1955.0  1.716853e+08
4183          India  1960.0  4.505477e+08
4184          India  1955.0  4.098806e+08
[4185 rows x 3 columns]
Cleanse Data:
Make sure to clean the data by dropping all the null values.
# dropping null values
df = df.dropna()
print(df.head(10))
  country    year    population
0   China  2020.0  1.439324e+09
1   China  2019.0  1.433784e+09
2   China  2018.0  1.427648e+09
3   China  2017.0  1.421022e+09
4   China  2016.0  1.414049e+09
5   China  2015.0  1.406848e+09
6   China  2010.0  1.368811e+09
7   China  2005.0  1.330776e+09
8   China  2000.0  1.290551e+09
9   China  1995.0  1.240921e+09
Form and App Layout Design:
Here we design the layout with Dash HTML components, including the Add Chart button. Every chart the user adds will go into the children of the container div.
app = dash.Dash(__name__)
app.layout = html.Div([
    html.H1("The World Population Dashboard with Dynamic Callbacks", style={"textAlign": "center"}),
    html.Hr(),
    html.P("Add as many charts for Data Visualization:"),
    html.Div(children=[
        html.Button('Add Chart', id='add-chart', n_clicks=0),
    ]),
    html.Div(id='container', children=[])
])
First Callback:
Each new child is appended to div_children. Every click triggers the callback, which builds another child, with all of its components, and appends it to div_children. The dcc.RadioItems component offers a choice of 4 chart types.
Output – displays the chart.
State – saves the input of the children.
@app.callback(
    Output('container', 'children'),
    [Input('add-chart', 'n_clicks')],
    [State('container', 'children')]
)
def display_graphs(n_clicks, div_children):
    # Build one chart block; every component in it shares index=n_clicks
    new_child = html.Div(
        style={'width': '45%', 'display': 'inline-block',
               'outline': 'thin lightgrey solid', 'padding': 10},
        children=[
            dcc.Graph(
                id={'type': 'dynamic-graph', 'index': n_clicks},
                figure={}
            ),
            # chart type selector
            dcc.RadioItems(
                id={'type': 'dynamic-choice', 'index': n_clicks},
                options=[{'label': 'Bar Chart', 'value': 'bar'},
                         {'label': 'Line Chart', 'value': 'line'},
                         {'label': 'Scatter Chart', 'value': 'scatter'},
                         {'label': 'Pie Chart', 'value': 'pie'}],
                value='bar',
            ),
            # country selector (multiple choice)
            dcc.Dropdown(
                id={'type': 'dynamic-dpn-s', 'index': n_clicks},
                options=[{'label': s, 'value': s} for s in np.sort(df['country'].unique())],
                multi=True,
                value=["United States", "China"],
            ),
            # categorical column selector
            dcc.Dropdown(
                id={'type': 'dynamic-dpn-ctg', 'index': n_clicks},
                options=[{'label': c, 'value': c} for c in ['country']],
                value='country',
                clearable=False
            ),
            # numeric column selector
            dcc.Dropdown(
                id={'type': 'dynamic-dpn-num', 'index': n_clicks},
                options=[{'label': n, 'value': n} for n in ['population']],
                value='population',
                clearable=False
            )
        ]
    )
    # Append the new block to the existing children and return the full list
    div_children.append(new_child)
    return div_children
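The ID pattern used by the first callback can be pictured without running Dash. The sketch below is plain Python, not Dash code: make_child_ids is a hypothetical helper that only rebuilds the dictionary IDs the callback assigns, to show that every component created by one call shares the same index. It assumes Dash also calls the callback once on page load with n_clicks=0, which is the default behaviour unless initial calls are prevented.

```python
# Plain-Python illustration (no Dash required) of the dictionary IDs
# that the first callback gives to each new group of components.
# make_child_ids is a hypothetical helper, not part of Dash.

COMPONENT_TYPES = [
    "dynamic-graph",
    "dynamic-choice",
    "dynamic-dpn-s",
    "dynamic-dpn-ctg",
    "dynamic-dpn-num",
]


def make_child_ids(n_clicks):
    """Return the IDs of the components created for one value of n_clicks."""
    return [{"type": t, "index": n_clicks} for t in COMPONENT_TYPES]


# Simulate the container after the initial call (n_clicks=0) and two clicks.
container = []
for n_clicks in (0, 1, 2):
    container.append(make_child_ids(n_clicks))

for child in container:
    print(child[0], "...", child[-1])
```

Because the index is taken from n_clicks, each chart block gets a unique index, which is what the second callback relies on when it matches inputs to outputs.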
Second Callback and create Graphs:
The notes below refer to the simpler dropdown example from the Dash documentation, which uses a display_dropdowns callback with dynamic-dropdown and dynamic-output IDs. The same rules apply to the dynamic-graph and dynamic-dpn-* IDs used in this app.
- The display_dropdowns callback returns two elements with the same index: a dropdown and a div.
- The second callback uses the MATCH selector. With this selector, we’re asking Dash to:
  - Fire the callback whenever the value property of any component with the id 'type': 'dynamic-dropdown' changes: Input({'type': 'dynamic-dropdown', 'index': MATCH}, 'value')
  - Update the component with the id 'type': 'dynamic-output' and the index that matches the same index of the input: Output({'type': 'dynamic-output', 'index': MATCH}, 'children')
  - Pass along the id of the dropdown into the callback: State({'type': 'dynamic-dropdown', 'index': MATCH}, 'id')
- With the MATCH selector, only a single value is passed into the callback for each Input or State.
- Notice how important it is to design ID dictionaries that “line up” the inputs with the outputs. The MATCH contract is that Dash will update whichever output has the same dynamic ID as the input. In this case, the “dynamic ID” is the value of the index, and we’ve designed our layout to return dropdowns and divs with identical values of index.
- In some cases, it may be important to know which dynamic component changed. As above, you can access this by setting id as State in the callback.
- You can also use dash.callback_context to access the inputs and state and to know which input changed. outputs_list is particularly useful with MATCH because it can tell you which dynamic component this particular invocation of the callback is responsible for updating.
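The MATCH contract described in the list above can be pictured with a few lines of plain Python. This is only a simulation of the pairing rule, not how Dash is implemented, and find_matched_output is a hypothetical name introduced for this sketch.

```python
# Plain-Python illustration of the MATCH contract: the output that gets
# updated is the one whose index equals the index of the input that changed.
# This is a simulation only; find_matched_output is not a Dash function.
from typing import Optional


def find_matched_output(changed_id: dict, output_type: str,
                        page_ids: list) -> Optional[dict]:
    """Return the ID with the given type and the same index as changed_id."""
    for component_id in page_ids:
        if (component_id["type"] == output_type
                and component_id["index"] == changed_id["index"]):
            return component_id
    return None


# IDs of the components currently on the page: two dropdown/output pairs.
page_ids = [
    {"type": "dynamic-dropdown", "index": 0},
    {"type": "dynamic-output", "index": 0},
    {"type": "dynamic-dropdown", "index": 1},
    {"type": "dynamic-output", "index": 1},
]

# The user changes the second dropdown; only its paired output is selected.
changed = {"type": "dynamic-dropdown", "index": 1}
target = find_matched_output(changed, "dynamic-output", page_ids)
print(target)  # {'type': 'dynamic-output', 'index': 1}
```

The sketch also shows why the IDs have to line up: if no output shares the index of the changed input, there is nothing to update.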
The second callback renders the chart interactively. Each ID is a dictionary with ‘type’ and ‘index’ keys. The dynamic part of the callback is the Input, which is given by a component_id and a component_property; here the property is the value. The Input triggers when the value of the matching component changes, for example the dynamic-dpn-s dropdown. The ‘index’: MATCH entry pairs that input with the graph that has the same index, so changing the dropdown with index 1 updates only the graph with index 1.
dff – a filtered version of the data frame. Always work on a copy rather than changing df itself.
Sometimes the user wants to see the data in different charts. With the multiple charts and dropdown options, users can select the countries they are interested in.
@app.callback(
    Output({'type': 'dynamic-graph', 'index': MATCH}, 'figure'),
    [Input(component_id={'type': 'dynamic-dpn-s', 'index': MATCH}, component_property='value'),
     Input(component_id={'type': 'dynamic-dpn-ctg', 'index': MATCH}, component_property='value'),
     Input(component_id={'type': 'dynamic-dpn-num', 'index': MATCH}, component_property='value'),
     Input({'type': 'dynamic-choice', 'index': MATCH}, 'value')]
)
def update_graph(s_value, ctg_value, num_value, chart_choice):
    print(s_value)
    # keep only the rows for the selected countries
    dff = df[df['country'].isin(s_value)]

    if chart_choice == 'bar':
        dff = dff.groupby([ctg_value], as_index=False)[['population']].sum()
        fig = px.bar(dff, x='country', y=num_value)
        return fig
    elif chart_choice == 'line':
        # empty figure when no country is selected
        if len(s_value) == 0:
            return {}
        else:
            dff = dff.groupby([ctg_value, 'year'], as_index=False)[['population']].sum()
            fig = px.line(dff, x='year', y=num_value, color=ctg_value)
            return fig
    elif chart_choice == 'scatter':
        # empty figure when exactly one country is selected
        if len(s_value) == 1:
            return {}
        else:
            dff = dff.groupby([ctg_value, 'year'], as_index=False)[['population']].sum()
            fig = px.scatter(dff, x='year', y=num_value, color=ctg_value)
            return fig
    elif chart_choice == 'pie':
        fig = px.pie(dff, names=ctg_value, values=num_value)
        return fig
Finally, run the app on Dash’s built-in development server.
if __name__ == '__main__': app.run_server(debug=False)
Dash is running on http://127.0.0.1:8050/
 * Serving Flask app "__main__" (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off

Conclusion:
CONGRATULATIONS! You have just learnt how to develop Web apps. Dash Plotly gives data scientists the power to build web apps to interact with data, deep learning, artificial intelligence and machine learning models.
In this introductory article, we’ve explored how to develop dashboard apps using Dash Plotly. Although it’s a trivial application, it illustrates the core concepts of this technology. Besides development, we’ve also seen how effortless it is to code in Plotly.
Dash is the original low-code framework for rapidly building data apps in Python, R, Julia, and F# (experimental).
Written on top of Plotly.js and React.js, Dash is ideal for building and deploying data apps with customized user interfaces. It’s particularly suited for anyone who works with data.
Through a couple of simple patterns, Dash abstracts away all of the technologies and protocols that are required to build a full-stack web app with interactive data visualization.
Dash is simple enough that you can bind a user interface to your code in less than 10 minutes.
Dash apps are rendered in the web browser. You can deploy your apps to VMs or Kubernetes clusters and then share them through URLs. Since Dash apps are viewed in the web browser, Dash is inherently cross-platform and mobile ready.
There is a lot behind the framework. To learn more about how it is built and what motivated Dash, read their announcement letter or their post Dash is React for Python.
Dash is an open source library released under the permissive MIT license. Plotly develops Dash and also offers a platform for writing and deploying Dash apps in an enterprise environment.
Web apps are great for data visualization and give clients more flexibility to navigate and explore the data. They are very user friendly and make the data easier to understand.
Related Topics:
Data Visualization Using Python
Data Visualization Using Python
In this example we’ll perform different Data Visualization charts on Population Data. There’s an easy way to create visuals directly from Pandas, and we’ll see how it works in detail in this tutorial.
Install necessary Libraries
To easily create interactive visualizations, we need to install Cufflinks. This is a library that connects Pandas with Plotly, so we can create visualizations directly from Pandas (in the past you had to learn workarounds to make them work together, but now it’s simpler). First, make sure Pandas and Plotly are installed by running the commands below in the terminal.
Install the following libraries in this order from the Conda command prompt:
pip install pandas
pip install plotly
pip install cufflinks
Import the following Libraries
import pandas as pd
import cufflinks as cf
from IPython.display import display, HTML

cf.set_config_file(sharing='public', theme='ggplot', offline=True)
In this case, I’m using the ‘ggplot’ theme, but feel free to choose any theme you want. Run the command cf.getThemes() to get all the themes available. To create data visualizations with Pandas in the following sections, we only need to use the syntax dataframe.iplot().
The data we’ll use is a population dataframe. First, download the CSV file from Kaggle.com, move the file to where your Python script is located, and then read it into a Pandas dataframe as shown below.
# read the population CSV into a dataframe
df_population = pd.read_csv('documents/population/population.csv')
# select rows by a list of index labels
print(df_population.loc[[0, 10]])
   country    year    population
0    China  2020.0  1.439324e+09
10   China  1990.0  1.176884e+09
print(df_population.head(10))
  country    year    population
0   China  2020.0  1.439324e+09
1   China  2019.0  1.433784e+09
2   China  2018.0  1.427648e+09
3   China  2017.0  1.421022e+09
4   China  2016.0  1.414049e+09
5   China  2015.0  1.406848e+09
6   China  2010.0  1.368811e+09
7   China  2005.0  1.330776e+09
8   China  2000.0  1.290551e+09
9   China  1995.0  1.240921e+09
This dataframe is almost ready for plotting. We just have to drop the null values, reshape it so that each country becomes a column indexed by year, and then select a few countries to test our interactive plots. The code shown below does all of this.
# dropping null values
df_population = df_population.dropna()
# reshaping the dataframe
df_population = df_population.pivot(index="year", columns="country", values="population")
# selecting 5 countries
df_population = df_population[['United States', 'India', 'China', 'Nigeria', 'Spain']]
print(df_population.head(10))
country  United States         India         China      Nigeria       Spain
year
1955.0     171685336.0  4.098806e+08  6.122416e+08   41086100.0  29048395.0
1960.0     186720571.0  4.505477e+08  6.604081e+08   45138458.0  30402411.0
1965.0     199733676.0  4.991233e+08  7.242190e+08   50127921.0  32146263.0
1970.0     209513341.0  5.551898e+08  8.276014e+08   55982144.0  33883749.0
1975.0     219081251.0  6.231029e+08  9.262409e+08   63374298.0  35879209.0
1980.0     229476354.0  6.989528e+08  1.000089e+09   73423633.0  37698196.0
1985.0     240499825.0  7.843600e+08  1.075589e+09   83562785.0  38733876.0
1990.0     252120309.0  8.732778e+08  1.176884e+09   95212450.0  39202525.0
1995.0     265163745.0  9.639226e+08  1.240921e+09  107948335.0  39787419.0
2000.0     281710909.0  1.056576e+09  1.290551e+09  122283850.0  40824754.0
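To make the reshaping step more concrete, here is a small plain-Python sketch of the same long-to-wide idea that pivot(index="year", columns="country", values="population") applies. It does not use Pandas; the dictionary layout is just an assumption chosen for illustration, and the four rows are taken from the tables printed above.

```python
# Plain-Python illustration of a long-to-wide reshape, the same idea as
# DataFrame.pivot(index="year", columns="country", values="population").
# Each tuple mirrors one CSV row: (country, year, population).
long_rows = [
    ("China", 2000, 1.290551e+09),
    ("China", 1995, 1.240921e+09),
    ("India", 2000, 1.056576e+09),
    ("India", 1995, 9.639226e+08),
]

# One entry per year (the new index), one key per country (the new columns).
wide = {}
for country, year, population in long_rows:
    wide.setdefault(year, {})[country] = population

for year in sorted(wide):
    print(year, wide[year])
```

After the reshape, each year holds one value per country, which is the layout .iplot() uses to draw one trace per column.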
Lineplot
Let’s make a lineplot to compare how much the population has grown from 1955 to 2020 for the 5 countries selected. As mentioned before, we will use the syntax df_population.iplot(kind='name_of_plot') to make plots as shown below.
df_population.iplot(kind='line',xTitle='Years', yTitle='Population', title='Population (1955-2020)')

Barplot
We can make a single barplot or barplots grouped by categories. Let’s have a look.
Single Barplot
Let’s create a barplot that shows the population of each country by the year 2020. To do so, first, we select the year 2020 from the index and then transpose rows with columns to get the year in the column. We’ll name this new dataframe df_population_2020 (we’ll use this dataframe again when plotting piecharts)
df_population_2020 = df_population[df_population.index.isin([2020])]
df_population_2020 = df_population_2020.T
Now we can plot this new dataframe with .iplot(). In this case, I’m going to set the bar color to blue using the color argument.
df_population_2020.iplot(kind='bar', color='blue', xTitle='Countries', yTitle='Population', title='Population in 2020')

Barplot grouped by “n” variables
Now let’s see the evolution of the population at the beginning of each decade.
# filter years out
df_population_sample = df_population[df_population.index.isin([1980, 1990, 2000, 2010, 2020])]
# plotting
df_population_sample.iplot(kind='bar', xTitle='Years', yTitle='Population')

Naturally, all of them increased their population throughout the years, but some did it at a faster rate.
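The difference in growth rates can be checked with a little arithmetic. The sketch below is plain Python and uses only values from the pivoted table printed earlier, so it compares 1980 with 2000 rather than 2020; the growth helper is a name made up for this example.

```python
# Relative population growth between 1980 and 2000, using values
# from the pivoted table printed earlier in this article.
population = {
    "United States": {1980: 229476354.0, 2000: 281710909.0},
    "Nigeria": {1980: 73423633.0, 2000: 122283850.0},
    "Spain": {1980: 37698196.0, 2000: 40824754.0},
}


def growth(series, start, end):
    """Return the relative change between two years (0.10 means +10%)."""
    return series[end] / series[start] - 1.0


rates = {country: growth(series, 1980, 2000)
         for country, series in population.items()}

# Print the countries from fastest to slowest growth.
for country, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{country}: {rate:.1%}")
```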
Boxplot
Boxplots are useful when we want to see the distribution of the data. The boxplot will reveal the minimum value, first quartile (Q1), median, third quartile (Q3), and maximum value. The easiest way to see those values is by creating an interactive visualization. Let’s see the population distribution of China.
df_population['China'].iplot(kind='box', color='green', yTitle='Population')

Let’s say now we want to get the same distribution but for all the selected countries.
df_population.iplot(kind='box', xTitle='Countries', yTitle='Population')

As we can see, we can also filter out any country by clicking on the legends on the right.
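The five values a boxplot reports can also be computed by hand. The sketch below uses Python’s standard statistics module on the ten China values shown in the pivoted table (1955 to 2000), so the numbers will not match a chart built from the full dataset. Quartile conventions also differ between tools, so treat the “inclusive” method used here as one reasonable choice and not necessarily the one Plotly applies.

```python
import statistics

# China's population for 1955-2000, copied from the pivoted table above.
china = [6.122416e+08, 6.604081e+08, 7.242190e+08, 8.276014e+08, 9.262409e+08,
         1.000089e+09, 1.075589e+09, 1.176884e+09, 1.240921e+09, 1.290551e+09]


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    # n=4 splits the data into quartiles and returns the three cut points.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


summary = five_number_summary(china)
for name, value in zip(["min", "Q1", "median", "Q3", "max"], summary):
    print(f"{name}: {value:,.0f}")
```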
Histogram
A histogram represents the distribution of numerical data. Let’s see the population distribution of the USA and Nigeria.
df_population[['United States', 'Nigeria']].iplot(kind='hist', xTitle='Population')
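What a histogram counts can be shown in plain Python. The sketch below splits the range of the data into equal-width bins and counts the values in each one; the histogram function and the choice of 4 bins are assumptions made for illustration, since Plotly picks its own bins automatically. The Nigeria values come from the pivoted table above (1955 to 2000).

```python
def histogram(values, bins):
    """Count how many values fall into each of `bins` equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    if width == 0:
        # All values are identical: everything lands in the first bin.
        return [len(values)] + [0] * (bins - 1)
    counts = [0] * bins
    for v in values:
        # The maximum value would fall just past the last bin, so clamp it.
        i = min(int((v - lo) / width), bins - 1)
        counts[i] += 1
    return counts


# Nigeria's population for 1955-2000, copied from the pivoted table above.
nigeria = [41086100.0, 45138458.0, 50127921.0, 55982144.0, 63374298.0,
           73423633.0, 83562785.0, 95212450.0, 107948335.0, 122283850.0]

counts = histogram(nigeria, bins=4)
print(counts)
```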

Piechart
Let’s compare the population by the year 2020 again but now with a piechart. To do so, we’ll use the df_population_2020 dataframe created in the “Single Barplot” section. However, to make a piechart we need the “country” as a column and not as an index, so we use .reset_index() to get the column back. Then we transform the 2020 into a string.
# transforming data
df_population_2020 = df_population_2020.reset_index()
df_population_2020 = df_population_2020.rename(columns={2020: '2020'})
# plotting
df_population_2020.iplot(kind='pie', labels='country', values='2020', title='Population in 2020 (%)')
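The percentages a piechart displays are each value’s share of the total. As a rough check, the plain-Python sketch below computes those shares for the five countries. It uses the year 2000 row of the pivoted table, because the 2020 values for all five countries are not printed in this article.

```python
# Populations in 2000 for the five selected countries,
# copied from the pivoted table printed earlier.
populations_2000 = {
    "United States": 281710909.0,
    "India": 1.056576e+09,
    "China": 1.290551e+09,
    "Nigeria": 122283850.0,
    "Spain": 40824754.0,
}

total = sum(populations_2000.values())

# Each slice of the pie is the country's percentage of the total.
shares = {country: population / total * 100
          for country, population in populations_2000.items()}

for country, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{country}: {share:.1f}%")
```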

Scatterplot
Although population data is not well suited to a scatterplot (the data follows a common pattern), I’ll make this plot for the purposes of this guide. Making a scatterplot is similar to making a line plot, but we have to add the mode argument.
df_population.iplot(kind='scatter', mode='markers')

Voilà! Now you’re ready to make your own beautiful interactive visualizations with Pandas.