\n"
]
},
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" year | \n",
" month | \n",
" day | \n",
" dayofweek | \n",
" dep_time | \n",
" crs_dep_time | \n",
" arr_time | \n",
" crs_arr_time | \n",
" carrier | \n",
" flight_num | \n",
" ... | \n",
" taxi_in | \n",
" taxi_out | \n",
" cancelled | \n",
" cancellation_code | \n",
" diverted | \n",
" carrier_delay | \n",
" weather_delay | \n",
" nas_delay | \n",
" security_delay | \n",
" late_aircraft_delay | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" 2008.0 | \n",
" 11.0 | \n",
" 15.0 | \n",
" 6.0 | \n",
" 1411.0 | \n",
" 1420.0 | \n",
" 1535.0 | \n",
" 1546.0 | \n",
" b'OO' | \n",
" 4391.0 | \n",
" ... | \n",
" 5.0 | \n",
" 11.0 | \n",
" 0.0 | \n",
" None | \n",
" 0.0 | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
"
\n",
" \n",
" 1 | \n",
" 2008.0 | \n",
" 11.0 | \n",
" 28.0 | \n",
" 5.0 | \n",
" 1222.0 | \n",
" 1230.0 | \n",
" 1345.0 | \n",
" 1356.0 | \n",
" b'OO' | \n",
" 4391.0 | \n",
" ... | \n",
" 5.0 | \n",
" 15.0 | \n",
" 0.0 | \n",
" None | \n",
" 0.0 | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
"
\n",
" \n",
" 2 | \n",
" 2008.0 | \n",
" 11.0 | \n",
" 22.0 | \n",
" 6.0 | \n",
" 1414.0 | \n",
" 1420.0 | \n",
" 1540.0 | \n",
" 1546.0 | \n",
" b'OO' | \n",
" 4391.0 | \n",
" ... | \n",
" 5.0 | \n",
" 10.0 | \n",
" 0.0 | \n",
" None | \n",
" 0.0 | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
"
\n",
" \n",
" 3 | \n",
" 2008.0 | \n",
" 11.0 | \n",
" 15.0 | \n",
" 6.0 | \n",
" 1304.0 | \n",
" 1305.0 | \n",
" 1507.0 | \n",
" 1519.0 | \n",
" b'OO' | \n",
" 4392.0 | \n",
" ... | \n",
" 10.0 | \n",
" 9.0 | \n",
" 0.0 | \n",
" None | \n",
" 0.0 | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
"
\n",
" \n",
" 4 | \n",
" 2008.0 | \n",
" 11.0 | \n",
" 22.0 | \n",
" 6.0 | \n",
" 1323.0 | \n",
" 1305.0 | \n",
" 1536.0 | \n",
" 1519.0 | \n",
" b'OO' | \n",
" 4392.0 | \n",
" ... | \n",
" 5.0 | \n",
" 21.0 | \n",
" 0.0 | \n",
" None | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 17.0 | \n",
"
\n",
" \n",
"
\n",
"
5 rows × 29 columns
\n",
"
"
],
"text/plain": [
" year month day dayofweek dep_time crs_dep_time arr_time \\\n",
"0 2008.0 11.0 15.0 6.0 1411.0 1420.0 1535.0 \n",
"1 2008.0 11.0 28.0 5.0 1222.0 1230.0 1345.0 \n",
"2 2008.0 11.0 22.0 6.0 1414.0 1420.0 1540.0 \n",
"3 2008.0 11.0 15.0 6.0 1304.0 1305.0 1507.0 \n",
"4 2008.0 11.0 22.0 6.0 1323.0 1305.0 1536.0 \n",
"\n",
" crs_arr_time carrier flight_num ... taxi_in taxi_out cancelled \\\n",
"0 1546.0 b'OO' 4391.0 ... 5.0 11.0 0.0 \n",
"1 1356.0 b'OO' 4391.0 ... 5.0 15.0 0.0 \n",
"2 1546.0 b'OO' 4391.0 ... 5.0 10.0 0.0 \n",
"3 1519.0 b'OO' 4392.0 ... 10.0 9.0 0.0 \n",
"4 1519.0 b'OO' 4392.0 ... 5.0 21.0 0.0 \n",
"\n",
" cancellation_code diverted carrier_delay weather_delay nas_delay \\\n",
"0 None 0.0 NaN NaN NaN \n",
"1 None 0.0 NaN NaN NaN \n",
"2 None 0.0 NaN NaN NaN \n",
"3 None 0.0 NaN NaN NaN \n",
"4 None 0.0 0.0 0.0 0.0 \n",
"\n",
" security_delay late_aircraft_delay \n",
"0 NaN NaN \n",
"1 NaN NaN \n",
"2 NaN NaN \n",
"3 NaN NaN \n",
"4 0.0 17.0 \n",
"\n",
"[5 rows x 29 columns]"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"flights = airline_flights.to_dask().persist()\n",
"print(type(flights))\n",
"flights.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Die Plot-API\n",
"\n",
"Die Schnittstellen\n",
"\n",
"* `dask.dataframe.DataFrame.hvplot`\n",
"* `pandas.DataFrame.hvplot`\n",
"* `intake.DataSource.plot`\n",
"\n",
"und ihre `Series`-Äquivalente bieten eine leistungsfähige High-Level-API um auch komplexe Plots erzeugen zu können. Dabei kann die `.hvplot`-API entweder direkt oder als Namespace verwendet werden, um bestimmte Plottypen zu generieren."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Die expliziteste Methode zur Verwendung der Plot-API besteht darin, die Namen der Spalten anzugeben, die auf der x- bzw. y-Achse geplottet werden sollen:"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.holoviews_exec.v0+json": "",
"text/html": [
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"
\n",
"
\n",
""
],
"text/plain": [
":Curve [Year] (Violent Crime rate)"
]
},
"execution_count": 9,
"metadata": {
"application/vnd.holoviews_exec.v0+json": {
"id": "2108"
}
},
"output_type": "execute_result"
}
],
"source": [
"crime.hvplot.line(x=\"Year\", y=\"Violent Crime rate\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Zusätzlich kann auch noch der Diagrammtyp mit `kind` angegeben werden:"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.holoviews_exec.v0+json": "",
"text/html": [
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"
\n",
"
\n",
""
],
"text/plain": [
":Scatter [Year] (Violent Crime rate)"
]
},
"execution_count": 10,
"metadata": {
"application/vnd.holoviews_exec.v0+json": {
"id": "2218"
}
},
"output_type": "execute_result"
}
],
"source": [
"crime.hvplot(x=\"Year\", y=\"Violent Crime rate\", kind=\"scatter\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Mit der `by`-Variable könnt ihr die Daten in einer oder mehreren zusätzlichen Spalten gruppieren. Als Beispiel wird im Folgenden die Abfahrtsverzögerung (`\"depdelay\"`) als Funktion der `\"distance\"` dargestellt und die Daten nach `\"carrier\"` gruppiert:"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.holoviews_exec.v0+json": "",
"text/html": [
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"
\n",
"
\n",
""
],
"text/plain": [
":NdOverlay [carrier]\n",
" :Scatter [distance] (depdelay)"
]
},
"execution_count": 11,
"metadata": {
"application/vnd.holoviews_exec.v0+json": {
"id": "2329"
}
},
"output_type": "execute_result"
}
],
"source": [
"flight_subset = flights[flights.carrier.isin([b\"OH\", b\"F9\"])]\n",
"flight_subset.hvplot(\n",
" x=\"distance\",\n",
" y=\"depdelay\",\n",
" by=\"carrier\",\n",
" kind=\"scatter\",\n",
" alpha=0.2,\n",
" persist=True,\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Im obigen Beispiel haben wir die x- und y-Achsen explizit angegeben.\n",
"\n",
"Andernfalls würde für die x-Achse die pandas-Indexspalte verwendet werden und für die y-Achse alle Nicht-Index-Spalten mit der Standardbezeichnung `value`. Wollt ihr nur die y-Achsenbezeichnung explizit angeben, so steht euch die `value_label`-Option zur Verfügung."
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"application/vnd.holoviews_exec.v0+json": "",
"text/html": [
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"
\n",
"
\n",
""
],
"text/plain": [
":NdOverlay [Variable]\n",
" :Curve [Year] (Rate (per 100k people))"
]
},
"execution_count": 12,
"metadata": {
"application/vnd.holoviews_exec.v0+json": {
"id": "2552"
}
},
"output_type": "execute_result"
}
],
"source": [
"crime.hvplot(\n",
" x=\"Year\",\n",
" y=[\"Violent Crime rate\", \"Robbery rate\", \"Burglary rate\"],\n",
" value_label=\"Rate (per 100k people)\",\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Der `hvplot`-Namespace \n",
"\n",
"Statt des `kind`-Argument können wir auch den `hvplot`-Namespace für den Plotaufruf zu verwenden. Die unterstützten Plottypen lassen sich leicht ermitteln mit der Tab-Vervollständigung, also\n",
"``` Python\n",
"crime.hvplot.TAB\n",
"```\n",
"\n",
"Verfügbare Diagrammtypen sind:\n",
"\n",
"* [area()](#area()) zeichnet ein Flächendiagramm ähnlich einem Liniendiagramm, außer dass der Bereich unter der Kurve gefüllt und optional gestapelt wird\n",
"* [bar()](#bar()) zeichnet ein Flächendiagramm ähnlich einem Liniendiagramm, außer dass der Bereich unter der Kurve gefüllt und optional gestapelt wird\n",
"* [bivariate()](#bivariate()) zeichnet ein Flächendiagramm ähnlich einem Liniendiagramm, außer dass der Bereich unter der Kurve gefüllt und optional gestapelt wird\n",
"* [box()](#box()) zeichnet ein [Box-Whisker-Diagramm](https://de.wikipedia.org/wiki/Box-Plot), in dem die Verteilung einer oder mehrerer Variablen verglichen wird\n",
"* [heatmap()](#heatmap()) zeichnet Hex-Bins\n",
"* [hexbin()](#hexbin()) zeichnet die Verteilung eines oder mehrerer Histogramme als Satz von Containern\n",
"* `histogram()` zeichnet die Kernel-Dichteschätzung einer oder mehrerer Variablen\n",
"* [kde()](#kde(),-density()) zeichnet die Kernel-Dichteschätzung einer oder mehrerer Variablen\n",
"* `line()` zeichnet ein Liniendiagramm (z.B. für eine Zeitreihe)\n",
"* [step()](#step()) zeichnet ein Schrittdiagramm, das einem Liniendiagramm ähnelt\n",
"* [scatter()](#scatter()) zeichnet ein Streudiagramm, in dem zwei Variablen verglichen werden \n",
"* [table()](#table()) erzeugt eine SlickGrid-Datentabelle\n",
"* `violin()` zeichnet ein Violinen-Diagramm, in dem die Verteilung einer oder mehrerer Variablen mithilfe der Kernel-Dichteschätzung verglichen wird"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### `area()`\n",
"\n",
"Wie die meisten anderen Diagrammtypen unterstützt das `area`-Diagramm die drei oben beschriebenen Möglichkeiten zum Definieren eines Diagramms. Ein Flächendiagramm ist am nützlichsten, wenn mehrere Variablen in einem gestapelten Diagramm dargestellt werden. Dies kann durch Angabe für die `x`-, `y`- und `by`-Spalten erreicht werden oder mit `columns` und `index`/`use_index` als Optionen für die `x`-Achse."
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.holoviews_exec.v0+json": "",
"text/html": [
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"
\n",
"
\n",
""
],
"text/plain": [
":NdOverlay [Variable]\n",
" :Area [Year] (value,Baseline)"
]
},
"execution_count": 13,
"metadata": {
"application/vnd.holoviews_exec.v0+json": {
"id": "2823"
}
},
"output_type": "execute_result"
}
],
"source": [
"crime.hvplot.area(x=\"Year\", y=[\"Robbery\", \"Aggravated assault\"])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Wir können auch explizit `stacked` auf `False` setzen und einen `alpha`-Wert definieren und um die Werte direkt vergleichen zu können:"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.holoviews_exec.v0+json": "",
"text/html": [
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"
\n",
"
\n",
""
],
"text/plain": [
":NdOverlay [Variable]\n",
" :Area [Year] (value)"
]
},
"execution_count": 14,
"metadata": {
"application/vnd.holoviews_exec.v0+json": {
"id": "3045"
}
},
"output_type": "execute_result"
}
],
"source": [
"crime.hvplot.area(\n",
" x=\"Year\", y=[\"Aggravated assault\", \"Robbery\"], stacked=False, alpha=0.4\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Eine andere Verwendung für ein Flächendiagramm besteht darin, die Streuung eines Werts zu visualisieren. Wenn wir beispielsweise den Flugdatensatz verwenden, möchten wir möglicherweise die Streuung der mittleren Verspätungswerte zwischen den Fluggesellschaften sehen. Zu diesem Zweck berechnen wir die mittlere Verzögerung nach Tag und Carrier und dann die minimale/maximale mittlere Verzögerung für alle Carrier. Da die Ausgabe von `hvplot` nur ein reguläres Holoviews-Objekt ist, können wir den Overlay-Operator (`*`) verwenden, um die Diagramme übereinander zu platzieren."
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.holoviews_exec.v0+json": "",
"text/html": [
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"
\n",
"
\n",
""
],
"text/plain": [
":Overlay\n",
" .Area.I :Area [day] (amin,amax)\n",
" .Curve.Carrier_delay :Curve [day] (carrier_delay)"
]
},
"execution_count": 15,
"metadata": {
"application/vnd.holoviews_exec.v0+json": {
"id": "3267"
}
},
"output_type": "execute_result"
}
],
"source": [
"delay_min_max = (\n",
" flights.groupby([\"day\", \"carrier\"])[\"carrier_delay\"]\n",
" .mean()\n",
" .groupby(\"day\")\n",
" .agg([np.min, np.max])\n",
")\n",
"delay_mean = flights.groupby(\"day\")[\"carrier_delay\"].mean()\n",
"\n",
"delay_min_max.hvplot.area(\n",
" x=\"day\", y=\"amin\", y2=\"amax\", alpha=0.2\n",
") * delay_mean.hvplot()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### `bar()`\n",
"\n",
"Im einfachsten Fall können wir `.hvplot.bar` verwenden. Um die Beschriftung auf der x-Achse um 90° zu drehen, geben wir noch `rot=90` an."
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.holoviews_exec.v0+json": "",
"text/html": [
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"
\n",
"
\n",
""
],
"text/plain": [
":Bars [Year] (Violent Crime rate)"
]
},
"execution_count": 16,
"metadata": {
"application/vnd.holoviews_exec.v0+json": {
"id": "3482"
}
},
"output_type": "execute_result"
}
],
"source": [
"crime.hvplot.bar(x=\"Year\", y=\"Violent Crime rate\", rot=90)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Wenn wir stattdessen mehrere Spalten vergleichen möchten, können wir eine Liste von Spalten festlegen. Mit der `stacked`-Option können wir dann die Spaltenwerte einfacher vergleichen:"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.holoviews_exec.v0+json": "",
"text/html": [
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"
\n",
"
\n",
""
],
"text/plain": [
":Bars [Year,Variable] (value)"
]
},
"execution_count": 17,
"metadata": {
"application/vnd.holoviews_exec.v0+json": {
"id": "3591"
}
},
"output_type": "execute_result"
}
],
"source": [
"crime.hvplot.bar(\n",
" x=\"Year\",\n",
" y=[\"Violent crime total\", \"Property crime total\"],\n",
" stacked=True,\n",
" rot=90,\n",
" width=800,\n",
" legend=\"top_left\",\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### `scatter()`\n",
"\n",
"Das Streudiagramm unterstützt viele der Funktionen der obigen Diagrammtypen, kann jedoch mit der `c`-Option auch eingefärbt werden."
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.holoviews_exec.v0+json": "",
"text/html": [
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"
\n",
"
\n",
""
],
"text/plain": [
":Scatter [Violent Crime rate] (Burglary rate,Year)"
]
},
"execution_count": 18,
"metadata": {
"application/vnd.holoviews_exec.v0+json": {
"id": "3709"
}
},
"output_type": "execute_result"
}
],
"source": [
"crime.hvplot.scatter(x=\"Violent Crime rate\", y=\"Burglary rate\", c=\"Year\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Um Farbe zur Darstellung einer Dimension zu verwenden, kann die `cmap`-Option genutzt werden werden, um die zu verwendende Farbkarte anzugeben. Zusätzlich kann die Farbleiste deaktiviert werden mit `colorbar=False`."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### `step()`\n",
"\n",
"Ein Schrittdiagramm ist einem Liniendiagramm sehr ähnlich, aber anstatt linear zwischen Abtastwerten zu interpolieren, visualisiert das Schrittdiagramm diskrete Schritte. Die Position der Schritte kann mit dem `where`-Schlüsselwort un den Werten `\"pre\"`, `\"mid\"` (Standard) und `\"post\"` gesteuert werden."
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.holoviews_exec.v0+json": "",
"text/html": [
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"
\n",
"
\n",
""
],
"text/plain": [
":NdOverlay [Variable]\n",
" :Curve [Year] (value)"
]
},
"execution_count": 19,
"metadata": {
"application/vnd.holoviews_exec.v0+json": {
"id": "3834"
}
},
"output_type": "execute_result"
}
],
"source": [
"crime.hvplot.step(x=\"Year\", y=[\"Robbery\", \"Aggravated assault\"])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### `hexbin()`\n",
"\n",
"Mit der `hexbin`-Methode können Sie hexagonale Bin-Diagramme erstellen. Sie können eine nützliche Alternative zu Streudiagrammen sein, wenn die Daten zu dicht sind, um jeden Punkt einzeln zu zeichnen. Da unsere Flugdaten nicht gleichmäßig auf einer linearen Skala verteilt sind, verwenden wir die `logz`-Option für eine logarithmische Skala."
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.holoviews_exec.v0+json": "",
"text/html": [
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"
\n",
"
\n",
""
],
"text/plain": [
":HexTiles [airtime,arrdelay]"
]
},
"execution_count": 20,
"metadata": {
"application/vnd.holoviews_exec.v0+json": {
"id": "4055"
}
},
"output_type": "execute_result"
}
],
"source": [
"flights.hvplot.hexbin(\n",
" x=\"airtime\", y=\"arrdelay\", width=600, height=500, logz=True\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### `bivariate()`\n",
"\n",
"Mit der `bivariate`-Methode könnt ihr ein 2D-Dichtediagramm erstellen. Bivariate Diagramme sind neben Hexbin-Diagrammen eine weitere Alternative zu Streudiagrammen, wenn die Daten zu dicht sind, um jeden Punkt einzeln zu zeichnen."
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.holoviews_exec.v0+json": "",
"text/html": [
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"
\n",
"
\n",
""
],
"text/plain": [
":Bivariate [Violent Crime rate,Burglary rate] (Density)"
]
},
"execution_count": 21,
"metadata": {
"application/vnd.holoviews_exec.v0+json": {
"id": "4240"
}
},
"output_type": "execute_result"
}
],
"source": [
"crime.hvplot.bivariate(\n",
" x=\"Violent Crime rate\", y=\"Burglary rate\", width=600, height=500\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### `heatmap()`\n",
"\n",
"`heatmap` kann die Beziehung zwischen drei Variablen anzeigen und neben den Variablen `'x'` und `'y'` zusätzlich `'C'` anzeigen. Zusätzlich werden mit der `reduce_function` die Werte für jeden Container aus den Stichproben berechnet."
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.holoviews_exec.v0+json": "",
"text/html": [
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"
\n",
"
\n",
""
],
"text/plain": [
":HeatMap [day,carrier] (depdelay)"
]
},
"execution_count": 22,
"metadata": {
"application/vnd.holoviews_exec.v0+json": {
"id": "4369"
}
},
"output_type": "execute_result"
}
],
"source": [
"flights.compute().hvplot.heatmap(\n",
" x=\"day\", y=\"carrier\", C=\"depdelay\", reduce_function=np.mean, colorbar=True\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### `table()`\n",
"\n",
"Im Gegensatz zu allen anderen Plottypen kann für eine Tabelle nur angegeben werden, ob alle Spalten oder mit `columns` nur eine Teilmenge angezeigt werden soll."
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.holoviews_exec.v0+json": "",
"text/html": [
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"
\n",
"
\n",
""
],
"text/plain": [
":Table [Year,Population,Violent Crime rate]"
]
},
"execution_count": 23,
"metadata": {
"application/vnd.holoviews_exec.v0+json": {
"id": "4502"
}
},
"output_type": "execute_result"
}
],
"source": [
"crime.hvplot.table(\n",
" columns=[\"Year\", \"Population\", \"Violent Crime rate\"], width=400\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### `hist()`\n",
"\n",
"Das Zeichnen von Verteilungen unterscheidet sich geringfügig von anderen Plots, da sie im einfachen Fall nur eine Variable darstellen. Daher muss bei diesem Plottyp kein `index` oder `x`-Wert angegeben werden, sondern\n",
"\n",
"* deklariert eine einzelne `y`-Variable, z.B. `source.plot.hist(variable)` oder\n",
"* deklariert eine `y`-Variable und eine `by`-Variable, z.B. `source.plot.hist(variable, by=\"Group\")` oder\n",
"* deklariert Spalten oder zeichnet alle Spalten, z.B. `source.plot.hist()` oder `source.plot.hist(columns=[\"A\", \"B\", \"C\"])`"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.holoviews_exec.v0+json": "",
"text/html": [
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"
\n",
"
\n",
""
],
"text/plain": [
":Histogram [Violent Crime rate] (Violent Crime rate_count)"
]
},
"execution_count": 24,
"metadata": {
"application/vnd.holoviews_exec.v0+json": {
"id": "4521"
}
},
"output_type": "execute_result"
}
],
"source": [
"crime.hvplot.hist(y=\"Violent Crime rate\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Alternativ önnen wir auch die Verteilung mehrerer Spalten darstellen:"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.holoviews_exec.v0+json": "",
"text/html": [
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"
\n",
"
\n",
""
],
"text/plain": [
":NdOverlay [Element]\n",
" :Histogram [Burglary rate] (Burglary rate_count)"
]
},
"execution_count": 25,
"metadata": {
"application/vnd.holoviews_exec.v0+json": {
"id": "4633"
}
},
"output_type": "execute_result"
}
],
"source": [
"columns = [\"Violent Crime rate\", \"Property crime rate\", \"Burglary rate\"]\n",
"crime.hvplot.hist(y=columns, bins=50, alpha=0.5, legend=\"top\", height=400)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Wir können die Daten auch nach anderen Variablen gruppieren und die Carrier in eigene `subplots` aufteilen:"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.holoviews_exec.v0+json": "",
"text/html": [
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"
\n",
"
\n",
""
],
"text/plain": [
":NdLayout [carrier]\n",
" :Histogram [depdelay] (depdelay_count)"
]
},
"execution_count": 26,
"metadata": {
"application/vnd.holoviews_exec.v0+json": {
"id": "5066"
}
},
"output_type": "execute_result"
}
],
"source": [
"flight_subset = flights[flights.carrier.isin([b\"AA\", b\"US\", b\"OH\"])]\n",
"flight_subset.hvplot.hist(\n",
" \"depdelay\",\n",
" by=\"carrier\",\n",
" bins=20,\n",
" bin_range=(-20, 100),\n",
" width=300,\n",
" subplots=True,\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### `kde()`, `density()`\n",
"\n",
"Ihr könnt Dichtediagramme auch mit `hvplot.kde()` oder `hvplot.density()` erstellen:"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.holoviews_exec.v0+json": "",
"text/html": [
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"
\n",
"
\n",
""
],
"text/plain": [
":Distribution [Violent Crime rate] (Density)"
]
},
"execution_count": 27,
"metadata": {
"application/vnd.holoviews_exec.v0+json": {
"id": "5430"
}
},
"output_type": "execute_result"
}
],
"source": [
"crime.hvplot.kde(y=\"Violent Crime rate\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Der Vergleich der Verteilung mehrerer Spalten ist ebenfalls möglich:"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.holoviews_exec.v0+json": "",
"text/html": [
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"
\n",
"
\n",
""
],
"text/plain": [
":NdOverlay [Variable]\n",
" :Distribution [Rate] (Density)"
]
},
"execution_count": 28,
"metadata": {
"application/vnd.holoviews_exec.v0+json": {
"id": "5542"
}
},
"output_type": "execute_result"
}
],
"source": [
"columns = [\"Violent Crime rate\", \"Property crime rate\", \"Burglary rate\"]\n",
"crime.hvplot.kde(y=columns, alpha=0.5, value_label=\"Rate\", legend=\"top_right\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`hvplot.kde` unterstützt auch das `by`-Schlüsselwort:"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.holoviews_exec.v0+json": "",
"text/html": [
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"
\n",
"
\n",
""
],
"text/plain": [
":NdLayout [carrier]\n",
" :Distribution [depdelay] (Density)"
]
},
"execution_count": 29,
"metadata": {
"application/vnd.holoviews_exec.v0+json": {
"id": "5973"
}
},
"output_type": "execute_result"
}
],
"source": [
"flight_subset = flights[flights.carrier.isin([b\"AA\", b\"US\", b\"OH\"])]\n",
"flight_subset.hvplot.kde(\n",
" \"depdelay\", by=\"carrier\", xlim=(-20, 70), width=300, subplots=True\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### `box()`\n",
"\n",
"Genau wie die anderen verteilungsbasierten Diagrammtypen unterstützt das [Box-Whisker-Diagramm](https://de.wikipedia.org/wiki/Box-Plot) das Zeichnen einer einzelnen Spalte:"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"application/vnd.holoviews_exec.v0+json": "",
"text/html": [
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"
\n",
"
\n",
""
],
"text/plain": [
":BoxWhisker (Violent Crime rate)"
]
},
"execution_count": 30,
"metadata": {
"application/vnd.holoviews_exec.v0+json": {
"id": "6336"
}
},
"output_type": "execute_result"
}
],
"source": [
"crime.hvplot.box(y=\"Violent Crime rate\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Es unterstützt auch mehrere Spalten und die gleichen Optionen wie berits oben genannt:`legend`, `invert` und `value_label`:"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"application/vnd.holoviews_exec.v0+json": "",
"text/html": [
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"
\n",
"
\n",
""
],
"text/plain": [
":BoxWhisker [Crime] (Rate (per 100k))"
]
},
"execution_count": 31,
"metadata": {
"application/vnd.holoviews_exec.v0+json": {
"id": "6573"
}
},
"output_type": "execute_result"
}
],
"source": [
"columns = [\n",
" \"Burglary rate\",\n",
" \"Larceny-theft rate\",\n",
" \"Motor vehicle theft rate\",\n",
" \"Property crime rate\",\n",
" \"Violent Crime rate\",\n",
"]\n",
"crime.hvplot.box(\n",
" y=columns,\n",
" group_label=\"Crime\",\n",
" legend=False,\n",
" value_label=\"Rate (per 100k)\",\n",
" invert=True,\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Auch die Verwendung des `by`-Schlüsselworts zum Aufteilen der Daten in mehrere Teilmengen wird unterstützt:"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.holoviews_exec.v0+json": "",
"text/html": [
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"
\n",
"
\n",
""
],
"text/plain": [
":BoxWhisker [carrier] (depdelay)"
]
},
"execution_count": 32,
"metadata": {
"application/vnd.holoviews_exec.v0+json": {
"id": "6810"
}
},
"output_type": "execute_result"
}
],
"source": [
"flight_subset = flights[flights.carrier.isin([b\"AA\", b\"US\", b\"OH\"])]\n",
"flight_subset.hvplot.box(\"depdelay\", by=\"carrier\", ylim=(-10, 70))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Zusammengesetzte Diagramme\n",
"\n",
"Eine der Hauptstärken von HoloViews ist die einfache Erstellung verschiedener Diagramme. Einzelne Diagramme können mit den Operatoren `*` und `+` überlagert bzw. zusammengesetzt werden.\n",
"\n",
"\n",
"\n",
"**Siehe auch:**\n",
"\n",
"* [Composing Elements](https://holoviews.org/user_guide/Composing_Elements.html)\n",
"
"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.holoviews_exec.v0+json": "",
"text/html": [
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"
\n",
"
\n",
""
],
"text/plain": [
":Overlay\n",
" .Curve.I :Curve [Year] (Violent Crime rate)\n",
" .Scatter.I :Scatter [Year] (Violent Crime rate)"
]
},
"execution_count": 33,
"metadata": {
"application/vnd.holoviews_exec.v0+json": {
"id": "7067"
}
},
"output_type": "execute_result"
}
],
"source": [
"crime.hvplot(x=\"Year\", y=\"Violent Crime rate\") * crime.hvplot.scatter(\n",
" x=\"Year\", y=\"Violent Crime rate\", c=\"k\"\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Wir können auch verschiedene Diagramme und Tabellen zusammen erstellen:"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.holoviews_exec.v0+json": "",
"text/html": [
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"
\n",
"
\n",
""
],
"text/plain": [
":Layout\n",
" .Bars.I :Bars [Year] (Violent Crime rate)\n",
" .Table.I :Table [Year,Population,Violent Crime rate]"
]
},
"execution_count": 34,
"metadata": {
"application/vnd.holoviews_exec.v0+json": {
"id": "7346"
}
},
"output_type": "execute_result"
}
],
"source": [
"(\n",
" crime.hvplot.bar(x=\"Year\", y=\"Violent Crime rate\", rot=90, width=550)\n",
" + crime.hvplot.table(\n",
" [\"Year\", \"Population\", \"Violent Crime rate\"], width=420\n",
" )\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Big Data\n",
"\n",
"In den vorherigen Beispielen fassten wir den relativ große Airline-Datensatz zusammen indem wir Teilmengen für die Darstellung bildeten. Stattdessen können wir die Daten jedoch auch mithilfe von [Datashader](../../datashader.ipynb) aggregieren, wobei der gesamte verfügbare Rohdatensatz gerendert wird (sofern die Auflösung des Bildschirms dies zulässt)."
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.holoviews_exec.v0+json": "",
"text/html": [
"\n",
"
\n",
" \n",
"\n",
"\n",
"\n",
"\n",
"\n",
"
\n",
"\n",
"
\n",
"
\n",
""
],
"text/plain": [
":DynamicMap []\n",
" :RGB [distance,airtime] (R,G,B,A)"
]
},
"execution_count": 35,
"metadata": {
"application/vnd.holoviews_exec.v0+json": {
"id": "7497"
}
},
"output_type": "execute_result"
}
],
"source": [
"flights.hvplot.scatter(x=\"distance\", y=\"airtime\", datashade=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### `groupby`\n",
"\n",
"Dank der Fähigkeit von HoloViews, einen Parameterraum mit einer Reihe von Widgets zu erkunden, können wir eine Gruppe entlang einer bestimmten Spalte oder Dimension anwenden, z.B. die Verteilung der Abflugverzögerungen nach Carriern und Tag gruppiert anzeigen, wobei Benutzer\\*innen auswählen können, welcher Tag angezeigt werden soll:"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"application/vnd.holoviews_exec.v0+json": "",
"text/html": [
"\n",
"
\n",
"
\n",
" \n",
"
\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"
\n",
"
\n",
" \n",
"
\n",
"
\n",
"
\n",
"
\n",
""
],
"text/plain": [
":DynamicMap [dayofweek]\n",
" :Violin [carrier] (depdelay)"
]
},
"execution_count": 36,
"metadata": {
"application/vnd.holoviews_exec.v0+json": {
"id": "7607"
}
},
"output_type": "execute_result"
}
],
"source": [
"flights.hvplot.violin(\n",
" y=\"depdelay\", by=\"carrier\", groupby=\"dayofweek\", ylim=(-20, 60), height=500\n",
")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3.11 Kernel",
"language": "python",
"name": "python311"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.4"
},
"latex_envs": {
"LaTeX_envs_menu_present": true,
"autoclose": false,
"autocomplete": true,
"bibliofile": "biblio.bib",
"cite_by": "apalike",
"current_citInitial": 1,
"eqLabelWithNumbers": true,
"eqNumInitial": 1,
"hotkeys": {
"equation": "Ctrl-E",
"itemize": "Ctrl-I"
},
"labels_anchors": false,
"latex_user_defs": false,
"report_style_numbering": false,
"user_envs_cfg": false
},
"widgets": {
"application/vnd.jupyter.widget-state+json": {
"state": {},
"version_major": 2,
"version_minor": 0
}
}
},
"nbformat": 4,
"nbformat_minor": 4
}