{ "cells": [ { "cell_type": "markdown", "id": "6080af38", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Data visualization, pt. 2 (`seaborn`)" ] }, { "cell_type": "markdown", "id": "118f7491", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Goals of this lecture\n", "\n", "- Introducing `seaborn`. \n", "- Putting `seaborn` into practice:\n", " - **Univariate** plots (histograms). \n", " - **Bivariate** continuous plots (scatterplots and line plots).\n", " - **Bivariate** categorical plots (bar plots, box plots, and strip plots)." ] }, { "cell_type": "markdown", "id": "5fa26f5e", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Introducing `seaborn`" ] }, { "cell_type": "markdown", "id": "6a4ffbb5", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### What is `seaborn`?\n", "\n", "> [`seaborn`](https://seaborn.pydata.org/) is a data visualization library based on `matplotlib`.\n", "\n", "- In general, it's easier to make nice-looking graphs with `seaborn`.\n", "- The trade-off is that `matplotlib` offers more flexibility." 
] }, { "cell_type": "code", "execution_count": 27, "id": "6a3c41f6", "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [], "source": [ "import seaborn as sns ### importing seaborn\n", "import pandas as pd\n", "import matplotlib.pyplot as plt ## just in case we need it\n", "import numpy as np" ] }, { "cell_type": "code", "execution_count": 28, "id": "1c0815db", "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [], "source": [ "%matplotlib inline \n", "%config InlineBackend.figure_format = 'retina'" ] }, { "cell_type": "markdown", "id": "b926c887", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### The `seaborn` hierarchy of plot types\n", "\n", "We'll learn more about exactly what this hierarchy means today (and in the next lecture).\n", "\n", "" ] }, { "cell_type": "markdown", "id": "914ef46e", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Example dataset\n", "\n", "Today we'll work with a new dataset, from [Gapminder](https://www.gapminder.org/data/documentation/). \n", "\n", "- **Gapminder** is an independent Swedish foundation dedicated to publishing and analyzing data to correct misconceptions about the world.\n", "- The dataset covers 1952 to 2007 and includes `life_exp`, `gdp_cap`, and `population`." ] }, { "cell_type": "code", "execution_count": 29, "id": "6b8d6ac8", "metadata": {}, "outputs": [], "source": [ "df_gapminder = pd.read_csv(\"data/viz/gapminder_full.csv\")" ] }, { "cell_type": "code", "execution_count": 30, "id": "3fcb4d6e", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", " | country | \n", "year | \n", "population | \n", "continent | \n", "life_exp | \n", "gdp_cap | \n", "
---|---|---|---|---|---|---|
0 | \n", "Afghanistan | \n", "1952 | \n", "8425333 | \n", "Asia | \n", "28.801 | \n", "779.445314 | \n", "
1 | \n", "Afghanistan | \n", "1957 | \n", "9240934 | \n", "Asia | \n", "30.332 | \n", "820.853030 | \n", "