Data Visualization In Python – A Complete Guide You Need

Data visualization is the process of representing quantitative and qualitative data in graphical form so that trends and patterns become easier to see. In Python, you can use visualization libraries such as Matplotlib, Seaborn, and Plotly. Data visualization is an essential part of AI (artificial intelligence) and ML (machine learning) applications, because different graphical representations give you key insights into your data.

Data Visualization In Python
  • Matplotlib and Seaborn
  • Line Charts
  • Bar Graphs
  • Histograms
  • Scatter Plots
  • Heat Maps

Matplotlib

Matplotlib is a low-level, easy-to-understand library for data visualization in Python. It is built on NumPy arrays and lets you build plots such as bar graphs, scatter plots, histograms, stem charts, step graphs, box plots, pie charts, and violin plots.
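As a minimal sketch of how Matplotlib draws from NumPy arrays, the example below puts three of the plot types listed above (step graph, stem chart, box plot) side by side. The data values and the output file name are made up for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

# Illustrative data (made up for this sketch)
x = np.arange(1, 7)
y = np.array([3, 5, 2, 6, 4, 7])

# One figure with three side-by-side axes
fig, axes = plt.subplots(1, 3, figsize=(12, 3.5))

axes[0].step(x, y, where="mid")
axes[0].set_title("Step graph")

axes[1].stem(x, y)
axes[1].set_title("Stem chart")

axes[2].boxplot(y)
axes[2].set_title("Box plot")

fig.tight_layout()
fig.savefig("plot_types.png")  # hypothetical file name
```

To open the figure in a window instead of saving it, replace the last line with plt.show().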

To install the library, use one of the following commands:

  1. pip install matplotlib
  2. conda install matplotlib
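One quick way to check that the installation worked is to import the package and print its version. This is only a sanity check, and the version printed depends on your environment.

```python
import matplotlib

# Prints the installed Matplotlib version string
print(matplotlib.__version__)
```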

Seaborn

Seaborn is a library that uses Matplotlib underneath to plot graphs. It is commonly used to visualize statistical distributions, such as the distribution of a set of values.

import matplotlib.pyplot as plt
import seaborn as sns

# histplot with kde=True replaces the deprecated distplot (seaborn 0.11+)
sns.histplot([0, 1, 2, 3, 4, 5], kde=True)
plt.show()

Scatter Plot

A scatter plot is a two-dimensional plot that shows the relationship between two variables. The independent variable is plotted on the x-axis and the dependent variable on the y-axis. It is suited to paired numerical data, where each point represents one (x, y) pair.

import matplotlib.pyplot as plt

x = [5,7,7,2,17,2,9,4,11,12,9,6]
y = [99,87,88,111,86,103,87,94,78,77,85,86]

plt.scatter(x, y)
plt.show()
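To put a number on the relationship a scatter plot shows, one option is the Pearson correlation coefficient. The sketch below reuses the x and y values from the example above, labels the axes, and shows the coefficient in the title. The axis labels and file name are assumptions added for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.array([5, 7, 7, 2, 17, 2, 9, 4, 11, 12, 9, 6])
y = np.array([99, 87, 88, 111, 86, 103, 87, 94, 78, 77, 85, 86])

# Pearson correlation coefficient between x and y (ranges from -1 to 1)
r = np.corrcoef(x, y)[0, 1]

fig, ax = plt.subplots()
ax.scatter(x, y)
ax.set_xlabel("x (independent variable)")
ax.set_ylabel("y (dependent variable)")
ax.set_title(f"Scatter plot, r = {r:.2f}")
fig.savefig("scatter_labeled.png")  # hypothetical file name
```

A negative coefficient means y tends to fall as x rises. Replace fig.savefig(...) with plt.show() to view the plot interactively.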

Preparation of Scatter Data

  • Collect paired data where a relationship is suspected.
  • Draw a graph with the independent variable on the horizontal axis and the dependent variable on the vertical axis.
  • Divide the points on the chart into four quadrants.
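The steps above can be sketched as follows. The text does not say how the quadrants are chosen, so this sketch assumes one common approach: splitting the chart at the median of x and the median of y. The data is the same sample used in the scatter plot example.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.array([5, 7, 7, 2, 17, 2, 9, 4, 11, 12, 9, 6])
y = np.array([99, 87, 88, 111, 86, 103, 87, 94, 78, 77, 85, 86])

# Assumption: the quadrants are split at the medians of x and y
x_med, y_med = np.median(x), np.median(y)

fig, ax = plt.subplots()
ax.scatter(x, y)
ax.axvline(x_med, linestyle="--", color="gray")  # vertical median line
ax.axhline(y_med, linestyle="--", color="gray")  # horizontal median line
ax.set_xlabel("Independent variable (x)")
ax.set_ylabel("Dependent variable (y)")

# Count points strictly inside each quadrant
# (points lying exactly on a median line are not counted)
counts = {
    "upper right": int(np.sum((x > x_med) & (y > y_med))),
    "upper left": int(np.sum((x < x_med) & (y > y_med))),
    "lower left": int(np.sum((x < x_med) & (y < y_med))),
    "lower right": int(np.sum((x > x_med) & (y < y_med))),
}
print(counts)
fig.savefig("scatter_quadrants.png")  # hypothetical file name
```

If most points fall in the upper-left and lower-right quadrants, the relationship is likely negative. If they fall mostly in the upper-right and lower-left quadrants, it is likely positive.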

To work with a larger dataset, install the following packages, then load the MNIST digit images and keep only the digits 0, 1, and 2:

pip install matplotlib seaborn bokeh
pip install tensorflow pandas

# Importing from tensorflow and keras
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Reshape
from tensorflow.keras import utils
from tensorflow import dtypes, tensordot
from tensorflow import convert_to_tensor, linalg, transpose

# For math operations
import numpy as np

# For plotting with matplotlib
import matplotlib.pyplot as plt

# For plotting with seaborn
import seaborn as sns

# For plotting with bokeh
from bokeh.plotting import figure, show
from bokeh.models import Legend, LegendItem

# For pandas dataframe
import pandas as pd

# load dataset
(x_train, train_labels), (_, _) = mnist.load_data()

# Choose only the digits 0, 1, 2.
total_classes = 3
ind = np.where(train_labels < total_classes)
x_train, train_labels = x_train[ind], train_labels[ind]

# Shape of training data
total_examples, img_length, img_width = x_train.shape

# Print the statistics
print('Training data has', total_examples, 'images')
print('Each image of size', img_length, 'x', img_width)
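Once images are loaded as arrays, a natural first step is to look at a few of them. The sketch below defines a hypothetical helper, show_digits, that draws a row of images with their labels. To stay self-contained it uses random stand-in arrays with the MNIST image shape (28 x 28) rather than downloading the dataset, so the pictures are noise, not real digits.

```python
import numpy as np
import matplotlib.pyplot as plt

def show_digits(images, labels, n_cols=5):
    """Plot the first n_cols images in one row, titled with their labels."""
    # squeeze=False keeps axes two-dimensional even when n_cols is 1
    fig, axes = plt.subplots(1, n_cols, figsize=(2 * n_cols, 2.5), squeeze=False)
    for ax, img, label in zip(axes[0], images, labels):
        ax.imshow(img, cmap="gray")
        ax.set_title(f"Label: {label}")
        ax.axis("off")
    return fig

# Stand-in data with the MNIST image shape; NOT real digits
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(5, 28, 28))
fake_labels = np.array([0, 1, 2, 1, 0])

fig = show_digits(fake_images, fake_labels)
fig.savefig("digits_preview.png")  # hypothetical file name
```

With the dataset loaded as shown earlier, calling show_digits(x_train, train_labels) would display the first five training images instead.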

Line Charts

A line chart displays the evolution of one or more numerical variables. It is a common chart type, so it can be built with any Python visualization library, such as Seaborn, Matplotlib, or Plotly.

# importing the required libraries
import matplotlib.pyplot as plt
import numpy as np

# define data values
x = np.array([1, 2, 3, 4])  # X-axis points
y = x * 2  # Y-axis points

plt.plot(x, y)  # Plot the chart
plt.show()  # display
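Since a line chart can show more than one variable, here is a minimal sketch that plots two series on the same axes and tells them apart with a legend. The second series (x squared), the labels, and the file name are assumptions added for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.array([1, 2, 3, 4])
y_double = x * 2    # first series: y = 2x
y_square = x ** 2   # second series: y = x^2 (added for illustration)

fig, ax = plt.subplots()
ax.plot(x, y_double, marker="o", label="y = 2x")
ax.plot(x, y_square, marker="s", label="y = x^2")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("Two series on one line chart")
ax.legend()
fig.savefig("line_chart.png")  # hypothetical file name
```

Each call to ax.plot adds one line, and the label argument supplies the text that ax.legend() displays.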

Bar plot

A bar chart represents categorical data with rectangular bars whose lengths or heights are proportional to the values they represent. Bars can be plotted vertically or horizontally. A bar chart shows comparisons between discrete categories: one axis lists the categories being compared, and the other axis shows the measured value for each category.

import numpy as np
import matplotlib.pyplot as plt

# creating the dataset
data = {'C': 20, 'C++': 15, 'Java': 30, 'Python': 35}
courses = list(data.keys())
values = list(data.values())

fig = plt.figure(figsize=(10, 5))

# creating the bar plot
plt.bar(courses, values, color='maroon', width=0.4)

plt.xlabel("Courses offered")
plt.ylabel("No. of students enrolled")
plt.title("Students enrolled in different courses")
plt.show()
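The same data can also be drawn horizontally. The sketch below uses barh with the course data from the example above and sorts the categories so the largest value ends up at the top. The sorting step is an optional choice made here for readability.

```python
import matplotlib.pyplot as plt

data = {"C": 20, "C++": 15, "Java": 30, "Python": 35}

# Sort ascending by value: barh draws the first item at the bottom,
# so the largest value ends up at the top
items = sorted(data.items(), key=lambda kv: kv[1])
courses = [name for name, _ in items]
values = [count for _, count in items]

fig, ax = plt.subplots(figsize=(10, 5))
bars = ax.barh(courses, values, color="maroon", height=0.4)
ax.set_xlabel("No. of students enrolled")
ax.set_ylabel("Courses offered")
ax.set_title("Students enrolled in different courses")
fig.savefig("bar_horizontal.png")  # hypothetical file name
```

Note that barh takes a height argument for bar thickness where bar takes width, and the axis labels swap places compared with the vertical version.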

Histogram

A histogram chart is a graph showing frequency distributions.

The graph shows the number of observations within each given interval.

import matplotlib.pyplot as plt
import numpy as np

# 250 random values from a normal distribution (mean 170, standard deviation 10)
x = np.random.normal(170, 10, 250)
print(x)

plt.hist(x)
plt.show()
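To see the "number of observations within each interval" directly, you can read the values that hist returns. This sketch assumes 10 bins and a fixed random seed (both choices made here so the run is repeatable) and prints the count for each interval.

```python
import numpy as np
import matplotlib.pyplot as plt

# Seeded generator so repeated runs give the same sample (seed chosen arbitrarily)
rng = np.random.default_rng(42)
x = rng.normal(170, 10, 250)

fig, ax = plt.subplots()
# hist returns the per-bin counts, the bin edges, and the drawn bar patches
counts, bin_edges, _ = ax.hist(x, bins=10, edgecolor="black")
ax.set_xlabel("Value")
ax.set_ylabel("Number of observations")

# Print the interval covered by each bin and how many values fall in it
for left, right, count in zip(bin_edges[:-1], bin_edges[1:], counts):
    print(f"{left:6.1f} to {right:6.1f}: {int(count)}")

fig.savefig("histogram.png")  # hypothetical file name
```

There is always one more bin edge than there are bins, and the counts add up to the number of values in x.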

Conclusion:

Data visualization provides a clear, organized, and easy-to-interpret pictorial representation of data for observation and analysis. Its main goal is to make it easier to identify trends, patterns, and outliers in large data sets. The field includes information graphics, statistical graphics, and information visualization. The purposes of data visualization fall into three main categories:

  1. to explore
  2. to monitor
  3. to explain.
