생능출판사, "Data Science Python" (working title): code for Chapter 9

9.2 Extracting individual characters from a string

In [1]:
s = 'Monty Python'
s[0]
Out[1]:
'M'
In [2]:
s[6:10]
Out[2]:
'Pyth'
In [3]:
s[-12:-7]
Out[3]:
'Monty'
In [4]:
t = s[:-2]
t
Out[4]:
'Monty Pyth'
In [5]:
t = s[-2:]
t
Out[5]:
'on'
In [6]:
s[:-2] + s[-2:]
Out[6]:
'Monty Python'
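
The negative indices in the cells above count from the end of the string, so s[-k] names the same position as s[len(s) - k]. A minimal sketch of that equivalence, reusing the same string; the helper normalize_index is hypothetical and exists only for illustration:

```python
s = 'Monty Python'

def normalize_index(i, length):
    # Hypothetical helper: map a negative index to its non-negative equivalent.
    return i + length if i < 0 else i

n = len(s)
start = normalize_index(-12, n)   # position of 'M'
stop = normalize_index(-7, n)     # position just past 'y'
same_slice = s[start:stop]        # same characters as s[-12:-7]
```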

9.3 Splitting a string

In [7]:
s = 'Welcome to Python' 
s.split()
Out[7]:
['Welcome', 'to', 'Python']
In [8]:
s = '2021.8.15' 
s.split('.')
Out[8]:
['2021', '8', '15']
In [9]:
s = 'Hello, World!' 
s.split(",")
Out[9]:
['Hello', ' World!']
In [10]:
s = 'Hello, World!' 
s.split(', ')
Out[10]:
['Hello', 'World!']
In [11]:
s = 'Welcome, to,  Python, and ,  bla, bla   '
[x.strip() for x in s.split(',')]
Out[11]:
['Welcome', 'to', 'Python', 'and', 'bla', 'bla']
In [12]:
list('Hello, World!')
Out[12]:
['H', 'e', 'l', 'l', 'o', ',', ' ', 'W', 'o', 'r', 'l', 'd', '!']
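
The pieces returned by split() are always strings, so numeric fields still need converting, and pieces that contain only whitespace become empty once stripped. A small sketch under those assumptions; the variable names date_text and messy are illustrative and not part of the original cells:

```python
date_text = '2021.8.15'
# Convert each '.'-separated field to an integer.
year, month, day = (int(part) for part in date_text.split('.'))

messy = 'Welcome, to,  Python, and ,  bla, bla   '
# Strip each piece and drop any piece that is empty after stripping.
words = [piece.strip() for piece in messy.split(',') if piece.strip()]
```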

9.4 Joining strings is easy in Python

In [13]:
','.join(['apple', 'grape', 'banana'])
Out[13]:
'apple,grape,banana'
In [14]:
'-'.join('010.1234.5678'.split('.'))  # change a phone number separated by '.' to use hyphens
Out[14]:
'010-1234-5678'
In [15]:
'010.1234.5678'.replace('.','-')
Out[15]:
'010-1234-5678'
In [16]:
s = 'hello world'
clist = list(s)
clist
Out[16]:
['h', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd']
In [17]:
''.join(clist)
Out[17]:
'hello world'
In [18]:
a_string = 'Actions \n\t speak louder than words'
a_string
Out[18]:
'Actions \n\t speak louder than words'
In [19]:
print(a_string)
Actions 
	 speak louder than words
In [20]:
word_list = a_string.split()
word_list
Out[20]:
['Actions', 'speak', 'louder', 'than', 'words']
In [21]:
refined_string = " ".join(word_list)    # the newline and tab characters in a_string are removed
print(refined_string)
Actions speak louder than words

9.5 Converting between uppercase and lowercase, and stripping characters

In [22]:
s = 'Hello, World!'
s.lower()
Out[22]:
'hello, world!'
In [23]:
s.upper()
Out[23]:
'HELLO, WORLD!'
In [24]:
s = "   Hello, World!   "
s.strip() 
Out[24]:
'Hello, World!'
In [25]:
s.lstrip() 
Out[25]:
'Hello, World!   '
In [26]:
s.rstrip() 
Out[26]:
'   Hello, World!'
In [27]:
s = "########this is an example#####"
s.strip('#')
Out[27]:
'this is an example'
In [28]:
s = "########this is an example#####"
s.lstrip('#')
Out[28]:
'this is an example#####'
In [29]:
s.rstrip('#')
Out[29]:
'########this is an example'
In [30]:
s.strip('#').capitalize()
Out[30]:
'This is an example'
In [31]:
s = "www.booksr.co.kr" 
s.find(".kr")
Out[31]:
13
In [32]:
s.find("x")    # returns -1 when the string 'x' is not found
Out[32]:
-1
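
Because -1 is itself a valid index (the last character), the result of find() is worth checking before it is used for slicing. A brief sketch contrasting find(), index(), and the in operator on the same string; the variable names are illustrative:

```python
s = 'www.booksr.co.kr'

pos = s.find('.kr')        # index of the first match
missing = s.find('x')      # -1, because 'x' does not occur
has_dot_kr = '.kr' in s    # membership test when the index is not needed

try:
    s.index('x')           # index() raises ValueError instead of returning -1
    raised = False
except ValueError:
    raised = True

# Guard against -1 before slicing, since s[-1:] would silently return 'r'.
domain_suffix = s[pos:] if pos != -1 else ''
```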

9.6 Various string-processing functions and the string module

In [33]:
s = 'www.booksr.co.kr'    # homepage of 생능출판사 (the publisher)
s.count('.')              # reports how many times '.' appears
Out[33]:
3
In [34]:
s = 'www.booksr.co.kr' 
ord(max(s))   # Unicode code point of the character in s with the largest code point
Out[34]:
119
In [35]:
ord(min(s))   # Unicode code point of the character in s with the smallest code point
Out[35]:
46
In [36]:
chr(119), chr(46)  # returns the characters for code points 119 and 46
Out[36]:
('w', '.')
In [37]:
import string
src_str = string.ascii_uppercase
print('src_str =', src_str)
src_str = ABCDEFGHIJKLMNOPQRSTUVWXYZ
In [38]:
src_str = string.ascii_uppercase
dst_str = src_str[1:] + src_str[:1]
print('dst_str =', dst_str)
dst_str = BCDEFGHIJKLMNOPQRSTUVWXYZA
In [39]:
n = src_str.index('A')
print('index of A in src_str =', n)
print('character of dst_str at the position of A in src_str =', dst_str[n])
index of A in src_str = 0
character of dst_str at the position of A in src_str = B

LAB 9-1 : Building a Caesar cipher

In [40]:
import string

src_str = string.ascii_uppercase
dst_str = src_str[3:] + src_str[:3]

def cipher(a):          # function that returns the encrypted character
    idx = src_str.index(a)
    return dst_str[idx]

src = input('Enter a sentence: ')
print('Encrypted sentence : ', end='')

for ch in src:
    if ch in src_str:
        print(cipher(ch), end='')
    else:
        print(ch, end='')

print()
Enter a sentence: ATTACK ON MIDNIGHT
Encrypted sentence : DWWDFN RQ PLGQLJKW
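
The per-character lookup above can also be written with str.maketrans() and str.translate(), which is a different technique from the index-based loop in the lab. A minimal sketch, assuming only uppercase ASCII letters are shifted; the helper make_tables is hypothetical:

```python
import string

def make_tables(shift):
    # Hypothetical helper: build encrypt/decrypt tables for uppercase ASCII letters.
    src = string.ascii_uppercase
    dst = src[shift:] + src[:shift]
    return str.maketrans(src, dst), str.maketrans(dst, src)

encrypt_table, decrypt_table = make_tables(3)

plain = 'ATTACK ON MIDNIGHT'
secret = plain.translate(encrypt_table)      # characters outside A-Z pass through unchanged
recovered = secret.translate(decrypt_table)  # the reversed table undoes the shift
```

Swapping the two alphabets in maketrans() gives the decryption table, so no separate decoding logic is needed.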

LAB 9-2 : Extracting words from a Twitter message

In [41]:
t = "There's a reason some people are working to make it harder to vote, especially for people of color. It’s because when we show up, things change."

length = len(t.split(" "))
print('word count:', length)
word count: 26
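
Counting with split(" ") works here because the words are separated by exactly one space. With doubled or trailing spaces the two forms of split() diverge, as this small sketch on a made-up string (not from the original tweet) suggests:

```python
t = "things  change.  "    # hypothetical text with doubled and trailing spaces

by_space = t.split(" ")    # splits at every single space, so empty strings appear
by_whitespace = t.split()  # splits on runs of whitespace and drops empty strings
```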

LAB 9-3 : Converting a Twitter message between uppercase and lowercase

In [42]:
t = "It's Not The Right Time To Conduct Exams. MY DEMAND IN BOLD AND CAPITAL. NO EXAMS IN COVID!!!"
t
Out[42]:
"It's Not The Right Time To Conduct Exams. MY DEMAND IN BOLD AND CAPITAL. NO EXAMS IN COVID!!!"
In [43]:
t = "It's Not The Right Time To Conduct Exams. MY DEMAND IN BOLD AND CAPITAL. NO EXAMS IN COVID!!!"
l = t.lower()
l
Out[43]:
"it's not the right time to conduct exams. my demand in bold and capital. no exams in covid!!!"

Challenge problem 9.3

  • In tweet data, frequent uppercase letters or exclamation marks often indicate that the writer is excited or angry. Write code that counts how many times uppercase letters and exclamation marks are used in the given source tweet. Counting the exclamation marks is easy with the count() function. How can the uppercase letters be counted? The original tweet can be split into individual characters with list() to build a list. If each item of this list is called ch, then ch.isupper() returns True when the character is an uppercase letter.
In [44]:
t_lst = list(t)
print('exclamation mark count :', t_lst.count('!'))
exclamation mark count : 3
In [45]:
count = 0
for ch in t_lst:
  if ch.isupper():
    count += 1

print('uppercase letter count :', count)
uppercase letter count : 46
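
The same two counts can be obtained without building a list first, since str.count() works on the string itself and a string can be iterated character by character. A compact sketch, not the book's solution, using the tweet from LAB 9-3:

```python
t = "It's Not The Right Time To Conduct Exams. MY DEMAND IN BOLD AND CAPITAL. NO EXAMS IN COVID!!!"

exclamations = t.count('!')                 # str.count works directly on the string
uppercase = sum(ch.isupper() for ch in t)   # each True counts as 1 when summed
```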

LAB 9-4 : Generating a one-time password

In [46]:
import random 

n_digits = int(input('How many digits should the password have? '))

otp = ''
for i in range(n_digits):
    otp += str(random.randrange(0, 10))

print(otp)
How many digits should the password have? 6
064898
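
The random module is intended for simulation rather than security, and the Python documentation recommends the secrets module for values such as passwords. A hedged variant of the lab using secrets.choice(); the helper make_otp is hypothetical and is not part of the original lab:

```python
import secrets
import string

def make_otp(n_digits):
    # Hypothetical helper: draw each digit from the OS-backed secure generator.
    return ''.join(secrets.choice(string.digits) for _ in range(n_digits))

otp = make_otp(6)   # a 6-character string of decimal digits
```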
In [47]:
!pip install wordcloud wikipedia
Collecting wordcloud
  Downloading wordcloud-1.8.1-cp37-cp37m-win_amd64.whl (154 kB)
Collecting wikipedia
  Using cached wikipedia-1.4.0.tar.gz (27 kB)
Requirement already satisfied: pillow in c:\users\shjung\anaconda3\lib\site-packages (from wordcloud) (7.0.0)
Requirement already satisfied: matplotlib in c:\users\shjung\anaconda3\lib\site-packages (from wordcloud) (3.1.1)
Requirement already satisfied: numpy>=1.6.1 in c:\users\shjung\anaconda3\lib\site-packages (from wordcloud) (1.18.1)
Requirement already satisfied: beautifulsoup4 in c:\users\shjung\anaconda3\lib\site-packages (from wikipedia) (4.8.2)
Requirement already satisfied: requests<3.0.0,>=2.0.0 in c:\users\shjung\anaconda3\lib\site-packages (from wikipedia) (2.22.0)
Requirement already satisfied: cycler>=0.10 in c:\users\shjung\anaconda3\lib\site-packages (from matplotlib->wordcloud) (0.10.0)
Requirement already satisfied: kiwisolver>=1.0.1 in c:\users\shjung\anaconda3\lib\site-packages (from matplotlib->wordcloud) (1.1.0)
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in c:\users\shjung\anaconda3\lib\site-packages (from matplotlib->wordcloud) (2.4.6)
Requirement already satisfied: python-dateutil>=2.1 in c:\users\shjung\anaconda3\lib\site-packages (from matplotlib->wordcloud) (2.8.1)
Requirement already satisfied: soupsieve>=1.2 in c:\users\shjung\anaconda3\lib\site-packages (from beautifulsoup4->wikipedia) (1.9.5)
Requirement already satisfied: idna<2.9,>=2.5 in c:\users\shjung\anaconda3\lib\site-packages (from requests<3.0.0,>=2.0.0->wikipedia) (2.8)
Requirement already satisfied: chardet<3.1.0,>=3.0.2 in c:\users\shjung\anaconda3\lib\site-packages (from requests<3.0.0,>=2.0.0->wikipedia) (3.0.4)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in c:\users\shjung\anaconda3\lib\site-packages (from requests<3.0.0,>=2.0.0->wikipedia) (1.25.8)
Requirement already satisfied: certifi>=2017.4.17 in c:\users\shjung\anaconda3\lib\site-packages (from requests<3.0.0,>=2.0.0->wikipedia) (2021.10.8)
Requirement already satisfied: six in c:\users\shjung\anaconda3\lib\site-packages (from cycler>=0.10->matplotlib->wordcloud) (1.14.0)
Requirement already satisfied: setuptools in c:\users\shjung\anaconda3\lib\site-packages (from kiwisolver>=1.0.1->matplotlib->wordcloud) (45.1.0.post20200127)
Building wheels for collected packages: wikipedia
  Building wheel for wikipedia (setup.py): started
  Building wheel for wikipedia (setup.py): finished with status 'done'
  Created wheel for wikipedia: filename=wikipedia-1.4.0-py3-none-any.whl size=11691 sha256=bea1c4272505244cf8f800fabf37c1b6de493a4537e09bb37b1c9d208d3fc884
  Stored in directory: c:\users\shjung\appdata\local\pip\cache\wheels\15\93\6d\5b2c68b8a64c7a7a04947b4ed6d89fb557dcc6bc27d1d7f3ba
Successfully built wikipedia
Installing collected packages: wordcloud, wikipedia
Successfully installed wikipedia-1.4.0 wordcloud-1.8.1
In [48]:
import wikipedia

# Specify the title of the Wikipedia page
wiki = wikipedia.page('Artificial intelligence')
# Extract the plain text content of the page
text = wiki.content
In [49]:
from wordcloud import WordCloud

# Generate word cloud
wordcloud = WordCloud(width = 2000, height = 1500).generate(text)
In [50]:
import matplotlib.pyplot as plt
plt.figure(figsize=(40, 30))
# Display image
plt.imshow(wordcloud) 
plt.show()