Saengneung Publishing (working title) "Data Science Python" code, Chapter 11: Let's draw great-looking charts

11.3 Jumping right in with matplotlib

In [1]:
import matplotlib.pyplot as plt

# Store the years and Korea's annual per-capita national income in years and gdp
years = [1950, 1960, 1970, 1980, 1990, 2000, 2010]
gdp = [67.0, 80.0, 257.0, 1686.0, 6505, 11865.3, 22105.3]

# Draw a line graph: years on the x-axis, gdp on the y-axis.
plt.plot(years, gdp, color='green', marker='o', linestyle='solid')

# Set the title.
plt.title("GDP per capita")

# Label the y-axis.
plt.ylabel("dollars")
plt.savefig("gdp_per_capita.png", dpi=600)  # the figure can also be saved as a PNG image
plt.show() 

11.4 Examining the matplotlib code

In [2]:
import matplotlib.pyplot as plt
In [3]:
years = [1950, 1960, 1970, 1980, 1990, 2000, 2010] 
gdp = [67.0, 80.0, 257.0, 1686.0, 6505, 11865.3, 22105.3]
In [4]:
plt.plot(years, gdp, color='green', marker='o', linestyle='solid') 
Out[4]:
[<matplotlib.lines.Line2D at 0x1e487affc08>]
In [5]:
plt.title("GDP per capita") 
Out[5]:
Text(0.5, 1.0, 'GDP per capita')
In [6]:
plt.ylabel("dollars")
plt.savefig("gdp_per_capita.png", dpi=600)
plt.show()
In [7]:
plt.plot(gdp, years, color='red', marker='o', linestyle='solid')  # arguments swapped: gdp on the x-axis, years on the y-axis
Out[7]:
[<matplotlib.lines.Line2D at 0x1e485922208>]
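Out[4] and Out[7] show that `plt.plot` returns a list of `Line2D` objects. The sketch below inspects that return value to confirm which argument ends up on which axis. The `Agg` backend and the variable names `lines`, `swapped`, `x_values`, and `y_values` are assumptions added so the example runs without a display; they are not part of the original notebook.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

years = [1950, 1960, 1970, 1980, 1990, 2000, 2010]
gdp = [67.0, 80.0, 257.0, 1686.0, 6505, 11865.3, 22105.3]

# plt.plot returns a list containing one Line2D per plotted line.
lines = plt.plot(gdp, years, color='red', marker='o', linestyle='solid')
swapped = lines[0]

# The first positional argument becomes the x data, the second the y data.
x_values = list(swapped.get_xdata())
y_values = list(swapped.get_ydata())
print(x_values == gdp, y_values == years)

plt.close('all')
```

Because the first positional argument is always read as x, swapping the two lists turns the chart on its side rather than raising an error.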

LAB11-1 Plotting mathematical functions easily

In [8]:
import matplotlib.pyplot as plt

x = [x for x in range(-10, 10)]
y = [2*t for t in x]            # y holds 2*x for each element of x
plt.plot(x, y, marker='o')      # line graph with circle markers

plt.axis([-20, 20, -20, 20])    # set the plotting region: [xmin, xmax, ymin, ymax]
plt.show()

11.5 Various techniques for decorating charts

In [9]:
import matplotlib.pyplot as plt # for brevity, this line is omitted from here on

x = [x for x in range(-20, 20)] # integers from -20 to 19 in steps of 1
y1 = [2*t for t in x]           # y1 holds 2*x for each element of x
y2 = [t**2 + 5 for t in x]      # y2 holds x**2 + 5 for each element of x
y3 = [-t**2 - 5 for t in x]     # y3 holds -x**2 - 5 for each element of x
# Draw the functions as a red dashed line, a green solid line with triangle markers,
# and a blue dotted line with star markers, respectively
plt.plot(x, y1, 'r--', x, y2, 'g^-', x, y3, 'b*:')
plt.axis([-30, 30, -30, 30])    # set the plotting region: [xmin, xmax, ymin, ymax]
plt.show()
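The format strings such as `'g^-'` pack color, marker, and line style into one argument. As a rough sketch, the same line can be written with explicit keyword arguments, and the resulting `Line2D` properties can be compared. The `Agg` backend and the names `short_form` and `long_form` are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

x = list(range(-20, 20))
y2 = [t**2 + 5 for t in x]

# Format-string form: green, triangle markers, solid line.
(short_form,) = plt.plot(x, y2, 'g^-')

# Keyword form that spells out the same three properties.
(long_form,) = plt.plot(x, y2, color='g', marker='^', linestyle='-')

for line in (short_form, long_form):
    print(line.get_color(), line.get_marker(), line.get_linestyle())

plt.close('all')
```

The keyword form is longer but easier to read when a style needs a color name or a property that the short format string cannot express.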

11.6 Plotting multiple data series on one chart

In [10]:
import matplotlib.pyplot as plt 
 
x = [x for x in range(20)]     # integers from 0 to 19
y = [x**2 for x in range(20)]  # x squared for each integer x from 0 to 19
z = [x**3 for x in range(20)]  # x cubed for each integer x from 0 to 19

plt.plot(x, x, label='linear')    # label for each line
plt.plot(x, y, label='quadratic')
plt.plot(x, z, label='cubic')

plt.xlabel('x label')      # x-axis label
plt.ylabel('y label')      # y-axis label
plt.title("My Plot") 
plt.legend() 
plt.show() 

LAB11-2

Drawing a sine graph, the basic trigonometric function

In [11]:
import math 
import matplotlib.pyplot as plt 
 
x = [] 
y = [] 
 
for angle in range(360): 
    x.append(angle) 
    y.append(math.sin(math.radians(angle))) 
 
plt.plot(x, y) 
plt.title("SINE WAVE") 
plt.show()
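The loop above converts each angle from degrees to radians before calling `math.sin`. A possible vectorized variant, assuming NumPy is available, computes all 360 values at once and checks them against the loop version. The names `angles` and `loop_y` are introduced here for illustration only.

```python
import math
import numpy as np

angles = np.arange(360)            # 0, 1, ..., 359 degrees
y = np.sin(np.radians(angles))     # convert to radians, then take the sine element-wise

# Spot-check against the standard-library version used in the loop.
loop_y = [math.sin(math.radians(a)) for a in range(360)]
print(np.allclose(y, loop_y))
print(round(float(y[90]), 6))      # sine of 90 degrees
```

Passing `angles` and `y` to `plt.plot` would draw the same curve as the list-based version.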

11.7 Drawing bar charts easily

In [12]:
from matplotlib import pyplot as plt

# Per-capita national income
years = [1950, 1960, 1970, 1980, 1990, 2000, 2010]
gdp = [67.0, 80.0, 257.0, 1686.0, 6505, 11865.3, 22105.3]

plt.bar(range(len(years)), gdp)

plt.title("GDP per capita")   # Set the title.
plt.ylabel("dollars")         # Label the y-axis.

# Put a tick on the x-axis for each bar, labeled with its year.
plt.xticks(range(len(years)), years)
plt.show()

11.8 Plotting the national income trends of several countries as a grouped bar chart

In [13]:
# Per-capita national income
years = [1965, 1975, 1985, 1995, 2005, 2015]
ko = [130, 650, 2450, 11600, 17790, 27250]
jp = [890, 5120, 11500, 42130, 40560, 38780]
ch = [100, 200, 290, 540, 1760, 7940]
In [14]:
# All three series share the same x positions, so the bars are drawn on top of each other.
x_range = range(len(years))
plt.bar(x_range, ko, width = 0.25)
plt.bar(x_range, jp, width = 0.25)
plt.bar(x_range, ch, width = 0.25)
Out[14]:
<BarContainer object of 6 artists>
In [15]:
import numpy as np

x_range = np.arange(len(years))
plt.bar(x_range + 0.0, ko, width = 0.25)
plt.bar(x_range + 0.3, jp, width = 0.25)
plt.bar(x_range + 0.6, ch, width = 0.25)
Out[15]:
<BarContainer object of 6 artists>
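In [15] switches from `range` to `np.arange` so that each country's bars can be shifted sideways. The sketch below shows why: a plain `range` cannot be offset by a float, while a NumPy array broadcasts the offset to every position. The names `range_supports_offset` and `jp_positions` are assumptions for illustration.

```python
import numpy as np

years = [1965, 1975, 1985, 1995, 2005, 2015]

# A plain range cannot be shifted by a float: range + 0.3 raises TypeError.
try:
    range(len(years)) + 0.3
    range_supports_offset = True
except TypeError:
    range_supports_offset = False

# A NumPy array broadcasts the scalar, shifting every bar position at once.
x_range = np.arange(len(years))
jp_positions = x_range + 0.3

print(range_supports_offset)
print(jp_positions)
```

With a bar width of 0.25 and offsets of 0.0, 0.3, and 0.6, the three bars in each group sit side by side with a small gap between them.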

11.9 Drawing a scatter plot that represents data as points

In [16]:
import matplotlib.pyplot as plt 
import numpy as np 
 
xData = np.arange(20, 50)
yData = xData + 2*np.random.randn(30)   # add noise to xData with the randn() function;
                                        # the noise is drawn from a normal distribution
 
plt.scatter(xData, yData) 
plt.title('Real Age vs Physical Age') 
plt.xlabel('Real Age') 
plt.ylabel('Physical Age') 
 
plt.savefig("kkk.png", dpi=600)
plt.show()

11.10 Pie charts, which bring a tasty pizza to mind

In [17]:
import matplotlib.pyplot as plt 
times = [8, 14, 2] 
In [18]:
timelabels = ["Sleep", "Study", "Play"]
In [19]:
# autopct shows each percentage to two decimal places.
# The timelabels list is passed to the labels parameter.
plt.pie(times, labels = timelabels, autopct = "%.2f")
plt.show() 
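The percentages that `autopct = "%.2f"` prints on the wedges can be reproduced by hand: each value is divided by the total and formatted with the same pattern. This is a minimal sketch of that arithmetic; the names `total` and `shares` are introduced here for illustration.

```python
times = [8, 14, 2]
timelabels = ["Sleep", "Study", "Play"]

total = sum(times)
# Each wedge's share of the whole, formatted with the same "%.2f" pattern.
shares = ["%.2f" % (100 * t / total) for t in times]

for label, share in zip(timelabels, shares):
    print(label, share)
```

Appending `%%` to the pattern, as in `"%.2f%%"`, would add a literal percent sign after each number.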
In [20]:
import matplotlib.pyplot as plt 
 
books = [ 1, 6, 2, 3, 1, 2, 0, 2 ] 

# Draw a histogram of the data stored in books using 6 bins
plt.hist(books, bins = 6)   

plt.xlabel("books") 
plt.ylabel("frequency") 
plt.show()

11.11 Seeing the distribution of data at a glance with a histogram

In [21]:
books = [ 1, 6, 2, 3, 1, 2, 0, 2 ] 
In [22]:
import matplotlib.pyplot as plt 
 
books = [ 1, 6, 2, 3, 1, 2, 0, 2 ] 

# Draw a histogram of the data stored in books using 6 bins
plt.hist(books, bins = 6)  

plt.xlabel("books") 
plt.ylabel("frequency") 
plt.show()
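To see which values fall into each of the 6 bins, the counts can be computed without drawing anything. The sketch below uses `np.histogram`, which splits the data range into equal-width bins the same way; treating it as a stand-in for the bar heights in the chart is an assumption of this example.

```python
import numpy as np

books = [1, 6, 2, 3, 1, 2, 0, 2]

# counts[i] is the number of values between edges[i] and edges[i + 1];
# the last bin also includes its right edge.
counts, edges = np.histogram(books, bins=6)

print(counts)
print(edges)
```

The data run from 0 to 6, so each bin is one unit wide, and the value 6 lands in the last bin because that bin is closed on the right.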

11.12 Drawing overlaid histograms: multiple histograms

In [23]:
import numpy as np 
import matplotlib.pyplot as plt 
 
n_bins = 10 
x = np.random.randn(1000) 
y = np.random.randn(1000) 
 
plt.hist(x, n_bins, histtype='bar', color= "red")
plt.hist(y, n_bins, histtype='bar', color= "blue", alpha=0.3)
plt.show()
In [24]:
import matplotlib.pyplot as plt 
 
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
plt.hist(x)
plt.show()

LAB11-3

Visually checking random numbers generated from a normal distribution

In [25]:
import numpy as np 
import matplotlib.pyplot as plt 
 
mu1, sigma1 = 10, 2
mu2, sigma2 = -6, 3

standard_Gauss = np.random.randn(10000)
Gauss1 = mu1 + sigma1 * np.random.randn(10000)
Gauss2 = mu2 + sigma2 * np.random.randn(10000) 

plt.figure(figsize=(10,6)) 
plt.hist(standard_Gauss, bins=50, alpha=0.4) 
plt.hist(Gauss1, bins=50, alpha=0.4)
plt.hist(Gauss2, bins=50, alpha=0.4)

plt.show()
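The expression `mu + sigma * np.random.randn(n)` shifts and scales standard normal samples so that they have mean `mu` and standard deviation `sigma`. As a rough numerical check alongside the histograms, the sample statistics can be compared with the chosen parameters. The fixed seed and the names `sample_mean` and `sample_std` are assumptions added for repeatability.

```python
import numpy as np

np.random.seed(0)  # assumption: fixed seed so the sketch is repeatable

mu1, sigma1 = 10, 2
Gauss1 = mu1 + sigma1 * np.random.randn(10000)

# With 10000 samples these should land close to mu1 and sigma1.
sample_mean = Gauss1.mean()
sample_std = Gauss1.std()
print(sample_mean, sample_std)
```

The values will not match exactly, since they are estimates from a finite sample, but they should move closer to 10 and 2 as the sample size grows.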

11.13 Getting to know the box plot, an efficient way to represent data

In [26]:
import numpy as np
import matplotlib.pyplot as plt

random_data = np.random.randn(100)
In [27]:
plt.boxplot(random_data)
plt.show()

11.14 Drawing multiple box plots

In [28]:
import numpy as np
import matplotlib.pyplot as plt
data1 = [1, 2, 3, 4, 5]
data2 = [2, 3, 4, 5, 6]
In [29]:
plt.boxplot([ data1, data2 ] )
Out[29]:
{'whiskers': [<matplotlib.lines.Line2D at 0x1e485f32c08>,
  <matplotlib.lines.Line2D at 0x1e485f35b88>,
  <matplotlib.lines.Line2D at 0x1e485f46a88>,
  <matplotlib.lines.Line2D at 0x1e485f46b88>],
 'caps': [<matplotlib.lines.Line2D at 0x1e485f35c88>,
  <matplotlib.lines.Line2D at 0x1e485f3cb08>,
  <matplotlib.lines.Line2D at 0x1e485f4aa48>,
  <matplotlib.lines.Line2D at 0x1e485f4ab48>],
 'boxes': [<matplotlib.lines.Line2D at 0x1e485f32a88>,
  <matplotlib.lines.Line2D at 0x1e485f40b88>],
 'medians': [<matplotlib.lines.Line2D at 0x1e485f3cc08>,
  <matplotlib.lines.Line2D at 0x1e485f50b08>],
 'fliers': [<matplotlib.lines.Line2D at 0x1e485f40a88>,
  <matplotlib.lines.Line2D at 0x1e485f50c08>],
 'means': []}
In [30]:
plt.boxplot(np.array([ data1, data2 ]) )
Out[30]:
{'whiskers': [<matplotlib.lines.Line2D at 0x1e485ede788>,
  <matplotlib.lines.Line2D at 0x1e485ede988>,
  <matplotlib.lines.Line2D at 0x1e486012208>,
  <matplotlib.lines.Line2D at 0x1e485eef2c8>,
  <matplotlib.lines.Line2D at 0x1e485ea3c08>,
  <matplotlib.lines.Line2D at 0x1e486006a08>,
  <matplotlib.lines.Line2D at 0x1e485e8d748>,
  <matplotlib.lines.Line2D at 0x1e485e8dfc8>,
  <matplotlib.lines.Line2D at 0x1e485e65688>,
  <matplotlib.lines.Line2D at 0x1e485e78cc8>],
 'caps': [<matplotlib.lines.Line2D at 0x1e485ef8788>,
  <matplotlib.lines.Line2D at 0x1e485ec0f48>,
  <matplotlib.lines.Line2D at 0x1e485eb1ec8>,
  <matplotlib.lines.Line2D at 0x1e485eb8808>,
  <matplotlib.lines.Line2D at 0x1e485e948c8>,
  <matplotlib.lines.Line2D at 0x1e485e94788>,
  <matplotlib.lines.Line2D at 0x1e485e86548>,
  <matplotlib.lines.Line2D at 0x1e485e81f08>,
  <matplotlib.lines.Line2D at 0x1e485e65048>,
  <matplotlib.lines.Line2D at 0x1e485e94088>],
 'boxes': [<matplotlib.lines.Line2D at 0x1e486038508>,
  <matplotlib.lines.Line2D at 0x1e485eb6888>,
  <matplotlib.lines.Line2D at 0x1e485eb6808>,
  <matplotlib.lines.Line2D at 0x1e485ea3588>,
  <matplotlib.lines.Line2D at 0x1e485e86608>],
 'medians': [<matplotlib.lines.Line2D at 0x1e485ec0e48>,
  <matplotlib.lines.Line2D at 0x1e485eb8a88>,
  <matplotlib.lines.Line2D at 0x1e485e8d108>,
  <matplotlib.lines.Line2D at 0x1e485e78c48>,
  <matplotlib.lines.Line2D at 0x1e485e59e88>],
 'fliers': [<matplotlib.lines.Line2D at 0x1e485eb6748>,
  <matplotlib.lines.Line2D at 0x1e485e9ea48>,
  <matplotlib.lines.Line2D at 0x1e485e8dcc8>,
  <matplotlib.lines.Line2D at 0x1e485e78648>,
  <matplotlib.lines.Line2D at 0x1e485e402c8>],
 'means': []}
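Out[29] lists 2 boxes while Out[30] lists 5, even though both calls receive the same numbers. A 2D array is read column by column, so the (2, 5) array yields one box per column. The sketch below counts the boxes in each case and shows that transposing the array restores two boxes. The `Agg` backend and the names `from_list`, `from_array`, and `from_transposed` are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

data1 = [1, 2, 3, 4, 5]
data2 = [2, 3, 4, 5, 6]

# A list of sequences: one box per sequence.
from_list = plt.boxplot([data1, data2])

# A 2D array of shape (2, 5): one box per column, so 5 boxes.
from_array = plt.boxplot(np.array([data1, data2]))

# Transposed to shape (5, 2): each data set is now a column, so 2 boxes.
from_transposed = plt.boxplot(np.array([data1, data2]).T)

print(len(from_list['boxes']), len(from_array['boxes']), len(from_transposed['boxes']))

plt.close('all')
```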

11.15 Drawing several graphs on one screen: subplots()

In [31]:
import matplotlib.pyplot as plt

fig, ax = plt.subplots(2, 2, figsize=(5, 5))

ax[0, 0].plot(range(10), 'r')  # row=0, col=0
ax[1, 0].plot(range(10), 'b')  # row=1, col=0
ax[0, 1].plot(range(10), 'g')  # row=0, col=1
ax[1, 1].plot(range(10), 'k')  # row=1, col=1
plt.show()
In [32]:
fig = plt.figure()
ax1 = fig.add_subplot(2, 2, 1)
ax2 = fig.add_subplot(2, 2, 2)
ax3 = fig.add_subplot(2, 2, 3)
ax4 = fig.add_subplot(2, 2, 4)
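The two cells above address the same 2 x 2 grid in different ways: `plt.subplots` returns an array indexed by 0-based `[row, col]`, while `add_subplot` takes a single 1-based index that counts across each row from the upper left. A minimal sketch of the mapping between the two follows; `subplot_index` is a hypothetical helper written for this example, not a matplotlib function, and the `Agg` backend is an assumption so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt


def subplot_index(row, col, ncols):
    """Hypothetical helper: 1-based add_subplot index for a 0-based (row, col)."""
    return row * ncols + col + 1


# plt.subplots(2, 2) returns a 2 x 2 array of Axes indexed as axs[row, col].
fig, axs = plt.subplots(2, 2)
print(axs.shape)

# The lower-left panel is axs[1, 0] in the array and index 3 for add_subplot.
fig2 = plt.figure()
ax3 = fig2.add_subplot(2, 2, subplot_index(1, 0, ncols=2))
print(subplot_index(1, 0, ncols=2))

plt.close('all')
```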

LAB11-4

Trying out subplots

In [33]:
import matplotlib.pyplot as plt 
import numpy as np 
 
np.random.seed(19680801) 
data = np.random.randn(2, 100) 
 
fig, axs = plt.subplots(2, 2, figsize=(5, 5)) 

axs[0, 0].hist(data[0]) 
axs[1, 0].scatter(data[0], data[1]) 
axs[0, 1].plot(data[0], data[1]) 
axs[1, 1].hist2d(data[0], data[1]) 
 
plt.show()