I'm having trouble displaying a legend on a scatter plot.
import pandas as pd
import numpy as np
import scipy.stats
import matplotlib
import matplotlib.pyplot as pp
import re
import mailbox
import csv
from IPython import display
from ipywidgets import interact, widgets
%matplotlib inline
datos = pd.read_csv('https://raw.githubusercontent.com/theengineeringworld/statistics-using-python/master/gapminder.xls')
datosDos = datos.dropna()
datosDos.reset_index(drop=True, inplace=True)
I am using the following function:
def plotyearDinamico(anio):
    data = datosDos[datosDos.year == anio]
    area = 10e-6 * datosDos.population
    color = data.region.map({'Africa': 'skyblue', 'Europe': 'gold',
                             'America': 'palegreen', 'Asia': 'coral'})
    data.plot.scatter('babies_per_woman', 'age5_surviving', s=area, c=color,
                      linewidth=1, edgecolors='k', figsize=(9, 6))
    pp.axis(ymin=50, ymax=105, xmin=0, xmax=8)
    pp.xlabel('babies_per_woman')
    pp.ylabel('age5_surviving')
Then calling that function I see the following graph:
interact(plotyearDinamico, anio=widgets.IntSlider(min=1950, max=2015,
step=1, value= 1950))
I know I need legend(), but I can't work out how to add it to this chart.
Source: Example of what I need
How can I show the names of the continents, since at the moment they are only distinguished by color? Something more or less like this:
import matplotlib.pyplot as plt
from numpy.random import rand
fig, ax = plt.subplots()
for color in ['red', 'green', 'blue']:
    n = 750
    x, y = rand(2, n)
    scale = 200.0 * rand(n)
    ax.scatter(x, y, c=color, s=scale, label=color,
               alpha=0.3, edgecolors='none')
ax.legend()
ax.grid(True)
plt.show()
If I add pp.legend(data.region) to my graph, it shows the following (the area and color match, but only one entry appears):
Various changes to your code:
- Loop over the regions and filter the data for each one: data_region = data[data.region == region]
- Call scatter of matplotlib, so we must properly configure the parameters x and y, by means of: x=data_region.babies_per_woman, y=data_region.age5_surviving
- alpha=0.7, to be able to appreciate the overlaps (somewhat, at least)
- label=region; region in this case is a string with the region's name.
It seems to me that your code is correct, but the labels are empty: the texts are not defined.
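The changes listed above (one scatter call per region, with explicit x and y, alpha=0.7 and label=region) could be put together roughly as follows. This is only a sketch: the small DataFrame is an invented stand-in for one year of the gapminder data, the Agg backend is chosen only so that the code runs without a display, and legend_labels is a hypothetical helper name used to inspect the result.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for datosDos[datosDos.year == anio]; the values are invented.
data = pd.DataFrame({
    "region": ["Africa", "Europe", "America", "Asia", "Africa", "Asia"],
    "babies_per_woman": [6.5, 2.1, 3.0, 5.2, 7.0, 4.1],
    "age5_surviving": [75.0, 97.0, 92.0, 85.0, 70.0, 88.0],
    "population": [2.0e7, 6.0e7, 1.5e8, 3.0e8, 1.0e7, 2.5e8],
})

# Same color mapping as in the question.
colors = {"Africa": "skyblue", "Europe": "gold",
          "America": "palegreen", "Asia": "coral"}

fig, ax = plt.subplots(figsize=(9, 6))
for region, color in colors.items():
    # Keep only the rows of the current region.
    data_region = data[data.region == region]
    # One scatter call per region, so each call contributes one legend entry.
    ax.scatter(x=data_region.babies_per_woman,
               y=data_region.age5_surviving,
               s=10e-6 * data_region.population,
               c=color, alpha=0.7, linewidths=1, edgecolors="k",
               label=region)

ax.set(xlim=(0, 8), ylim=(50, 105),
       xlabel="babies_per_woman", ylabel="age5_surviving")
ax.legend(title="region")

# Collect the legend texts to check that every region got an entry.
legend_labels = [text.get_text() for text in ax.get_legend().get_texts()]
```

Inside plotyearDinamico, the same loop would replace the single data.plot.scatter call, with data filtered by anio as before.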
I'll give you this example: https://www.dkrz.de/up/services/analysis/visualization/sw/ncl/examples/source_code/dkrz-ncl-scatter-plot-with-legend-example
There you can see that the label texts have to be defined before the legend is displayed:
Notice that these labels do appear on the graph:
Now I see more clearly: it seems to me that the legend call loads a single label and not the array. It should load the entire array.
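One way to read the point about a single label: a single scatter call creates a single artist, and a legend pairs labels with artists, so only one entry can appear however many labels are passed. If you prefer to keep one scatter call, an alternative (not proposed in the answers above, so treat it as an assumption) is to build the legend entries by hand with matplotlib.patches.Patch, one per region. The data points below are invented for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs anywhere
import matplotlib.pyplot as plt
from matplotlib.patches import Patch

# Same color mapping as in the question.
colors = {"Africa": "skyblue", "Europe": "gold",
          "America": "palegreen", "Asia": "coral"}

# Hypothetical points, one per region (values invented for illustration).
regions = ["Africa", "Europe", "America", "Asia"]
babies = [6.5, 2.1, 3.0, 5.2]
surviving = [75.0, 97.0, 92.0, 85.0]
sizes = [200, 600, 1500, 900]

fig, ax = plt.subplots(figsize=(9, 6))
# A single scatter call: one artist, colored point by point.
ax.scatter(babies, surviving, s=sizes,
           c=[colors[r] for r in regions], linewidths=1, edgecolors="k")

# Proxy artists: one colored patch per region, used only for the legend.
handles = [Patch(facecolor=color, edgecolor="k", label=region)
           for region, color in colors.items()]
legend = ax.legend(handles=handles, title="region")

# Collect the legend texts to check that every region is listed.
proxy_labels = [text.get_text() for text in legend.get_texts()]
```

The trade-off is that the legend is no longer derived from the plotted data, so the colors dictionary has to stay in sync with the mapping used for the points.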