plotnine examples#

Simple scatter plot#

1. Imports#

[1]:
from plotnine import *
from plotnine.data import mtcars

2. Scatter plot#

[2]:
(ggplot(mtcars, aes('wt', 'mpg'))
 + geom_point())
../../_images/matplotlib_plotnine_examples_5_0.png
[2]:
<ggplot: (8786654708803)>

plotnine.mapping.aes creates the aesthetic mappings, with miles per gallon (mpg) on the y-axis and car weight (wt) on the x-axis. plotnine.geoms.geom_point then draws the scatter plot.

3. Color-coding by a variable#

[3]:
(ggplot(mtcars, aes('wt', 'mpg', color='factor(gear)'))
 + geom_point())
../../_images/matplotlib_plotnine_examples_8_0.png
[3]:
<ggplot: (8786646201998)>

4. Smoothed linear model with confidence intervals#

plotnine.stats.stat_smooth computes smoothed conditional means; with method='lm', a linear model is fitted:

[4]:
(ggplot(mtcars, aes('wt', 'mpg', color='factor(gear)'))
 + geom_point()
 + stat_smooth(method='lm'))
../../_images/matplotlib_plotnine_examples_10_0.png
[4]:
<ggplot: (8786646169281)>

5. Display in separate panels#

plotnine.facets.facet_wrap splits the plot into separate panels, here one panel per value of gear.

[5]:
(ggplot(mtcars, aes('wt', 'mpg', color='factor(gear)'))
 + geom_point()
 + stat_smooth(method='lm')
 + facet_wrap('~gear'))
../../_images/matplotlib_plotnine_examples_12_0.png
[5]:
<ggplot: (8786646108338)>

Interactive plots#

Combined with ipywidgets, plotnine can also be used to create interactive plots.

1. Imports#

[6]:
import matplotlib.pyplot as plt
import plotnine as p9
import pandas as pd
import numpy as np
from copy import copy
from ipywidgets import widgets
from IPython.display import display

from plotnine.data import mtcars

2. Create an interactive scatter plot#

In the following, we plot horsepower on the x-axis and miles per gallon on the y-axis, and use color to encode the weight of the cars:

[7]:
%matplotlib notebook

p = p9.ggplot(mtcars, p9.aes(x="hp", y="mpg", color="wt")) + \
        p9.geom_point() + p9.theme_linedraw()
p
[7]:
<ggplot: (8786645966029)>

3. Select the cars by number of cylinders#

[8]:
# Prepare the list used to select subsets of the data by number of cylinders.
cylList = np.unique( mtcars['cyl'] )

# The first selection is a drop-down menu for number of cylinders
cylSelect = widgets.Dropdown(
    options=list(cylList),
    value=cylList[1],
    description='Cylinders:',
    disabled=False,
)

# To make the widgets update the same plot instead of creating a new image every time
# a selection changes, we keep track of the matplotlib figure and axes. We create only one
# figure and set of axes for the first plot, and then re-use them
# with plotnine's "_draw_using_figure" function.
fig = None
axs = None

# This is the main function that is called to update the plot every time we change a selection.
def plotUpdate(*args):

    # Use global variables for matplotlib's figure and axis.
    global fig, axs

    # Get current values of the selection widget
    cylValue = cylSelect.value

    # Create a temporary dataset that is constrained by the user's selections.
    tmpDat = mtcars.loc[(mtcars['cyl'] == cylValue),:]

    # Create the plotnine plot for the selected subset. The axis limits are not fixed yet,
    # so they change with the cylinder selection.
    p = p9.ggplot(tmpDat, p9.aes(x="hp", y="mpg", color="wt")) + \
        p9.geom_point() + p9.theme_linedraw()

    if fig is None:
        # If this is the first time a plot is made in the notebook, we let plotnine create a new
        # matplotlib figure and axis.
        fig, plot = p.draw(return_ggplot=True)
        axs = plot.axs
    else:

        #p = copy(p)
        # This keeps previously selected data from remaining visible after a new selection is made.
        # We delete all previously created artists from the matplotlib axes.
        for artist in plt.gca().lines +\
                        plt.gca().collections +\
                        plt.gca().artists + plt.gca().patches + plt.gca().texts:
            artist.remove()

        # If a plot is being updated, we re-use the figure and axes created before.
        p._draw_using_figure(fig, axs)


cylSelect.observe(plotUpdate, 'value')

# Display the widgets
display(cylSelect)

# Plot the first image with the initial values.
plotUpdate()

# Matplotlib function to make the image fit within the plot dimensions.
plt.tight_layout()

# Trick to get the first rendered image to follow the previous "tight_layout" command.
# Without this, the figure would only fit inside its dimensions after the first update.
cylSelect.value = cylList[0]
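The `observe` mechanism used above can be tried in isolation. The following sketch registers a callback on a stand-alone dropdown and then changes its value in code, which fires the callback in the same way a user selection would. The names `selector`, `on_change` and `changes`, as well as the option values, are assumptions for illustration.

```python
from ipywidgets import widgets

# Hypothetical stand-alone dropdown; the option values are illustrative.
selector = widgets.Dropdown(options=[4, 6, 8], value=6, description="Cylinders:")

changes = []

def on_change(change):
    # `change` holds, among other things, the old and the new trait value.
    changes.append((change["old"], change["new"]))

# Only react to changes of the `value` trait.
selector.observe(on_change, names="value")

# Assigning a new value in code fires the callback, just like a user selection.
selector.value = 8
print(changes)
```

This is also why assigning `cylSelect.value` at the end of the cell triggers a redraw: the assignment itself notifies every observer of `value`.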

4. Add a drop-down menu#

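We now create the drop-down menu for selecting the number of cylinders again, this time fixing the axis and color limits to the full range of the data so that the plot does not rescale between selections.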

[9]:
# We now get the full ranges of the relevant variables to keep the axes constant between images.

# Get range of weight
minWt = min(mtcars['wt'])
maxWt = max(mtcars['wt'])
# We get all unique values of weight, sort them, and transform the numpy.array into a Python list.
wtOptions = list( np.sort(np.unique(mtcars.loc[mtcars['cyl']==cylList[0],'wt']))  )

minHP = min(mtcars['hp'])
maxHP = max(mtcars['hp'])

minMPG = min(mtcars['mpg'])
maxMPG = max(mtcars['mpg'])

# The first selection is a drop-down menu for number of cylinders
cylSelect = widgets.Dropdown(
    options=list(cylList),
    value=cylList[1],
    description='Cylinders:',
    disabled=False,
)

# To make the widgets update the same plot instead of creating a new image every time
# a selection changes, we keep track of the matplotlib figure and axes. We create only one
# figure and set of axes for the first plot, and then re-use them
# with plotnine's "_draw_using_figure" function.
fig = None
axs = None

# This is the main function that is called to update the plot every time we change a selection.
def plotUpdate(*args):

    # Use global variables for matplotlib's figure and axis.
    global fig, axs

    # Get current values of the selection widget
    cylValue = cylSelect.value

    # Create a temporary dataset that is constrained by the user's selections.
    tmpDat = mtcars.loc[(mtcars['cyl'] == cylValue),:]

    # Create plotnine's plot

    # Using the minimum and maximum values gathered above, we keep the plot axes from
    # changing with the cylinder selection.
    p = p9.ggplot(tmpDat, p9.aes(x="hp", y="mpg", color="wt")) + \
        p9.geom_point() + p9.theme_linedraw() + \
        p9.xlim([minHP, maxHP]) + p9.ylim([minMPG, maxMPG]) + \
        p9.scale_color_continuous(limits=(minWt, maxWt))

    if fig is None:
        fig, plot = p.draw(return_ggplot=True)
        axs = plot.axs
    else:
        #p = copy(p)
        for artist in plt.gca().lines +\
                        plt.gca().collections +\
                        plt.gca().artists + plt.gca().patches + plt.gca().texts:
            artist.remove()
        p._draw_using_figure(fig, axs)


cylSelect.observe(plotUpdate, 'value')

# Display the widgets
display(cylSelect)

# Plot the first image with the initial values.
plotUpdate()

# Matplotlib function to make the image fit within the plot dimensions.
plt.tight_layout()

# Trick to get the first rendered image to follow the previous "tight_layout" command.
# Without this, the figure would only fit inside its dimensions after the first update.
cylSelect.value = cylList[0]
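The fixed limits above are computed column by column with the built-in `min` and `max`. As an alternative sketch, pandas can compute both bounds for several columns in one call. The small data frame below contains only a few illustrative rows with the same column names as mtcars, and the names `cars`, `limits` and `hp_limits` are assumptions.

```python
import pandas as pd

# A few illustrative rows using the same column names as mtcars.
cars = pd.DataFrame({
    "hp": [110, 93, 175, 62],
    "mpg": [21.0, 22.8, 18.7, 24.4],
    "wt": [2.620, 2.320, 3.440, 3.190],
})

# Compute the minimum and maximum of every column in one call.
limits = cars.agg(["min", "max"])

# Each column of `limits` holds the (min, max) pair for one variable,
# which is the shape expected by xlim, ylim and the `limits` argument of a scale.
hp_limits = tuple(limits["hp"])
print(hp_limits)
```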

5. Add a range slider#

Now we use a range slider to restrict the data based on the weight of the cars:

[10]:
# The first selection is a drop-down menu for the number of cylinders.
cylSelect = widgets.Dropdown(
    options=list(cylList),
    value=cylList[1],
    description='Cylinders:',
    disabled=False,
)

# The second selection is a range of weights.
wtSelect = widgets.SelectionRangeSlider(
    options=wtOptions,
    index=(0, len(wtOptions)-1),
    description='Weight',
    disabled=False
)

widgetsCtl = widgets.HBox([cylSelect, wtSelect])

# The range of weights always depends on the cylinder selection.
def updateRange(*args):
    '''Updates the selection range of the slider depending on the cylinder selection.'''
    cylValue = cylSelect.value

    wtOptions = list(np.sort(np.unique(mtcars.loc[mtcars['cyl'] == cylValue, 'wt'])))

    wtSelect.options = wtOptions
    wtSelect.index = (0, len(wtOptions)-1)

cylSelect.observe(updateRange, 'value')

# To make the widgets update the same plot instead of creating a new image every time
# a selection changes, we keep track of the matplotlib figure and axes. We create only one
# figure and set of axes for the first plot, and then re-use them
# with plotnine's "_draw_using_figure" function.
fig = None
axs = None

# This is the main function that is called to update the plot every time we change a selection.
def plotUpdate(*args):

    # Use global variables for matplotlib's figure and axes.
    global fig, axs

    # Get the current values of the selection widgets.
    cylValue = cylSelect.value
    wtRange = wtSelect.value

    # Create a temporary dataset that is constrained by the user's selections.
    tmpDat = mtcars.loc[(mtcars['cyl'] == cylValue) & \
                        (mtcars['wt'] >= wtRange[0]) & \
                        (mtcars['wt'] <= wtRange[1]),:]

    # Create plotnine's plot with fixed axis and color limits.
    p = p9.ggplot(tmpDat, p9.aes(x="hp", y="mpg", color="wt")) + \
        p9.geom_point() + p9.theme_linedraw() + \
        p9.xlim([minHP, maxHP]) + p9.ylim([minMPG, maxMPG]) + \
        p9.scale_color_continuous(limits=(minWt, maxWt))

    if fig is None:
        # First plot: let plotnine create a new matplotlib figure and axes.
        fig, plot = p.draw(return_ggplot=True)
        axs = plot.axs
    else:
        # Remove all previously created artists, then redraw into the same figure.
        for artist in plt.gca().lines +\
                        plt.gca().collections +\
                        plt.gca().artists + plt.gca().patches + plt.gca().texts:
            artist.remove()
        p._draw_using_figure(fig, axs)


cylSelect.observe(plotUpdate, 'value')
wtSelect.observe(plotUpdate, 'value')

# Display the widgets.
display(widgetsCtl)

# Plot the first image with the initial values.
plotUpdate()

# Matplotlib function to make the image fit within the plot dimensions.
plt.tight_layout()

# Trick to get the first rendered image to follow the previous "tight_layout" command.
# Without this, the figure would only fit inside its dimensions after the first update.
cylSelect.value = cylList[0]
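The row filter used in the update function combines three comparisons. A possible shorter spelling uses `Series.between`, which is inclusive on both ends by default. The sketch below compares both forms on a few illustrative rows with mtcars-style column names; `cars`, `cyl_value` and `wt_range` are assumed names standing in for the widget values.

```python
import pandas as pd

# A few illustrative rows using the same column names as mtcars.
cars = pd.DataFrame({
    "cyl": [6, 4, 8, 4, 4],
    "wt": [2.620, 2.320, 3.440, 3.190, 1.615],
    "mpg": [21.0, 22.8, 18.7, 24.4, 30.4],
})

# Stand-ins for the values read from the dropdown and the range slider.
cyl_value = 4
wt_range = (2.0, 3.2)

# Explicit comparisons, as in the notebook cell.
explicit = cars.loc[
    (cars["cyl"] == cyl_value)
    & (cars["wt"] >= wt_range[0])
    & (cars["wt"] <= wt_range[1])
]

# Series.between includes both end points by default.
compact = cars.loc[(cars["cyl"] == cyl_value) & cars["wt"].between(*wt_range)]

print(compact)
```

Both selections return the same rows; which form to prefer is mostly a matter of readability.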

6. Refine the plot#

Finally, we change a few plot properties to make the figure easier to understand:

[11]:
# The first selection is a drop-down menu for the number of cylinders.
cylSelect = widgets.Dropdown(
    options=list(cylList),
    value=cylList[1],
    description='Cylinders:',
    disabled=False,
)

# The second selection is a range of weights.
wtSelect = widgets.SelectionRangeSlider(
    options=wtOptions,
    index=(0, len(wtOptions)-1),
    description='Weight',
    disabled=False
)

widgetsCtl = widgets.HBox([cylSelect, wtSelect])

# The range of weights always depends on the cylinder selection.
def updateRange(*args):
    '''Updates the selection range of the slider depending on the cylinder selection.'''
    cylValue = cylSelect.value

    wtOptions = list(np.sort(np.unique(mtcars.loc[mtcars['cyl'] == cylValue, 'wt'])))

    wtSelect.options = wtOptions
    wtSelect.index = (0, len(wtOptions)-1)

cylSelect.observe(updateRange, 'value')

fig = None
axs = None

# This is the main function that is called to update the plot every time we change a selection.
def plotUpdate(*args):

    # Use global variables for matplotlib's figure and axes.
    global fig, axs

    # Get the current values of the selection widgets.
    cylValue = cylSelect.value
    wtRange = wtSelect.value

    # Create a temporary dataset that is constrained by the user's selections of
    # number of cylinders and weight.
    tmpDat = mtcars.loc[(mtcars['cyl'] == cylValue) & \
                        (mtcars['wt'] >= wtRange[0]) & \
                        (mtcars['wt'] <= wtRange[1]),:]

    # Create plotnine's plot showing all data in smaller grey points, and
    # the selected data with coloured points.
    # "spring" is a matplotlib colormap name, so it is passed as cmap_name;
    # the legend title is set with labs(color=...).
    p = p9.ggplot(tmpDat, p9.aes(x="hp", y="mpg", color="wt")) + \
        p9.geom_point(mtcars, p9.aes(x="hp", y="mpg"), color="grey") + \
        p9.geom_point(size=3) + p9.theme_linedraw() + \
        p9.xlim([minHP, maxHP]) + p9.ylim([minMPG, maxMPG]) + \
        p9.scale_color_continuous(cmap_name="spring", limits=(np.floor(minWt), np.ceil(maxWt))) + \
        p9.labs(x="Horse-Power", y="Miles Per Gallon", color="Weight")

    if fig is None:
        # First plot: let plotnine create a new matplotlib figure and axes.
        fig, plot = p.draw(return_ggplot=True)
        axs = plot.axs
    else:
        # Remove all previously created artists, then redraw into the same figure.
        for artist in plt.gca().lines +\
                        plt.gca().collections +\
                        plt.gca().artists + plt.gca().patches + plt.gca().texts:
            artist.remove()
        p._draw_using_figure(fig, axs)


cylSelect.observe(plotUpdate, 'value')
wtSelect.observe(plotUpdate, 'value')

# Display the widgets.
display(widgetsCtl)

# Plot the first image with the initial values.
plotUpdate()

# Matplotlib function to make the image fit within the plot dimensions.
plt.tight_layout()

# Trick to get the first rendered image to follow the previous "tight_layout" command.
# Without this, the figure would only fit inside its dimensions after the first update.
cylSelect.value = cylList[0]