Saengneung Publishing (생능출판사), "Data Science Python" (working title): Code for Chapter 9

9.2 Extracting individual characters from a string

In [1]:
s = 'Monty Python'
s[0]
Out[1]:
'M'
In [2]:
s[6:10]
Out[2]:
'Pyth'
In [3]:
s[-12:-7]
Out[3]:
'Monty'
In [4]:
t = s[:-2]
t
Out[4]:
'Monty Pyth'
In [5]:
t = s[-2:]
t
Out[5]:
'on'
In [6]:
s[:-2] + s[-2:]
Out[6]:
'Monty Python'

9.3 Splitting strings apart

In [7]:
s = 'Welcome to Python' 
s.split()
Out[7]:
['Welcome', 'to', 'Python']
In [8]:
s = '2021.8.15' 
s.split('.')
Out[8]:
['2021', '8', '15']
In [9]:
s = 'Hello, World!' 
s.split(",")
Out[9]:
['Hello', ' World!']
In [10]:
s = 'Hello, World!' 
s.split(', ')
Out[10]:
['Hello', 'World!']
In [11]:
s = 'Welcome, to,  Python, and ,  bla, bla   '
[x.strip() for x in s.split(',')]
Out[11]:
['Welcome', 'to', 'Python', 'and', 'bla', 'bla']
In [12]:
list('Hello, World!')
Out[12]:
['H', 'e', 'l', 'l', 'o', ',', ' ', 'W', 'o', 'r', 'l', 'd', '!']

9.4 Joining strings is easy for Python

In [13]:
','.join(['apple', 'grape', 'banana'])
Out[13]:
'apple,grape,banana'
In [14]:
'-'.join('010.1234.5678'.split('.'))  # convert a phone number separated by '.' into one separated by hyphens
Out[14]:
'010-1234-5678'
In [15]:
'010.1234.5678'.replace('.','-')
Out[15]:
'010-1234-5678'
In [16]:
s = 'hello world'
clist = list(s)
clist
Out[16]:
['h', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd']
In [17]:
''.join(clist)
Out[17]:
'hello world'
In [18]:
a_string = 'Actions \n\t speak louder than words'
a_string
Out[18]:
'Actions \n\t speak louder than words'
In [19]:
print(a_string)
Actions 
	 speak louder than words
In [20]:
word_list = a_string.split()
word_list
Out[20]:
['Actions', 'speak', 'louder', 'than', 'words']
In [21]:
refined_string = " ".join(word_list)    # the newline and tab characters in a_string are removed
print(refined_string)
Actions speak louder than words

9.5 Converting between uppercase and lowercase, and stripping strings

In [22]:
s = 'Hello, World!'
s.lower()
Out[22]:
'hello, world!'
In [23]:
s.upper()
Out[23]:
'HELLO, WORLD!'
In [24]:
s = "   Hello, World!   "
s.strip() 
Out[24]:
'Hello, World!'
In [25]:
s.lstrip() 
Out[25]:
'Hello, World!   '
In [26]:
s.rstrip() 
Out[26]:
'   Hello, World!'
In [27]:
s = "########this is an example#####"
s.strip('#')
Out[27]:
'this is an example'
In [28]:
s = "########this is an example#####"
s.lstrip('#')
Out[28]:
'this is an example#####'
In [29]:
s.rstrip('#')
Out[29]:
'########this is an example'
In [30]:
s.strip('#').capitalize()
Out[30]:
'This is an example'
In [31]:
s = "www.booksr.co.kr" 
s.find(".kr")
Out[31]:
13
In [32]:
s.find("x")    # returns -1 when the string 'x' is not found
Out[32]:
-1

9.6 Assorted string-processing functions and the string module

In [33]:
s = 'www.booksr.co.kr'    # website of Saengneung Publishing (생능출판사)
s.count('.')              # reports how many times '.' appears
Out[33]:
3
In [34]:
s = 'www.booksr.co.kr' 
ord(max(s))   # returns the code point of the character in s with the largest Unicode value
Out[34]:
119
In [35]:
ord(min(s))   # returns the code point of the character in s with the smallest Unicode value
Out[35]:
46
In [36]:
chr(119), chr(46)  # returns the characters for the Unicode values 119 and 46
Out[36]:
('w', '.')
In [37]:
import string
src_str = string.ascii_uppercase
print('src_str =', src_str)
src_str = ABCDEFGHIJKLMNOPQRSTUVWXYZ
In [38]:
src_str = string.ascii_uppercase
dst_str = src_str[1:] + src_str[:1]
print('dst_str =', dst_str)
dst_str = BCDEFGHIJKLMNOPQRSTUVWXYZA
In [39]:
n = src_str.index('A')
print('index of A in src_str =', n)
print('character in dst_str at the position of A in src_str =', dst_str[n])
index of A in src_str = 0
character in dst_str at the position of A in src_str = B

LAB 9-1 : Building a Caesar cipher

In [40]:
import string

src_str = string.ascii_uppercase
dst_str = src_str[3:] + src_str[:3]

def ciper(a):          # function that returns the encrypted character
    idx = src_str.index(a)
    return dst_str[idx]

src = input('Enter a sentence: ')
print('Encrypted sentence: ', end='')

for ch in src:
    if ch in src_str:
        print(ciper(ch), end='')
    else:
        print(ch, end='')

print()
Enter a sentence: ATTACK ON MIDNIGHT
Encrypted sentence: DWWDFN RQ PLGQLJKW
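The cipher in LAB 9-1 shifts each uppercase letter three positions. As a minimal sketch that is not part of the original notebook, the same mapping can be expressed with str.maketrans, and swapping the two alphabets gives the inverse table for decryption. The names enc_table and dec_table are hypothetical and chosen only for this illustration.

```python
import string

src_str = string.ascii_uppercase
dst_str = src_str[3:] + src_str[:3]      # same shift-by-3 alphabet as LAB 9-1

# Hypothetical helper tables (not in the original notebook)
enc_table = str.maketrans(src_str, dst_str)   # A -> D, B -> E, ...
dec_table = str.maketrans(dst_str, src_str)   # D -> A, E -> B, ...

encrypted = 'ATTACK ON MIDNIGHT'.translate(enc_table)
decrypted = encrypted.translate(dec_table)
print(encrypted)
print(decrypted)
```

Characters outside the table, such as spaces, pass through translate() unchanged, which mirrors the else branch of the loop in the lab.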

LAB 9-2 : Extracting words from a Twitter message

In [41]:
t = "There's a reason some people are working to make it harder to vote, especially for people of color. It’s because when we show up, things change."

length = len(t.split(" "))
print('word count:', length)
word count: 26
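The word count above relies on split(" "), which splits on every single space. A small sketch, using a made-up string that is not from the notebook, shows how that differs from split() with no argument when the text contains repeated spaces.

```python
t = "when  we   show up"      # made-up sample with repeated spaces

by_space = t.split(" ")   # splits at every single space, producing empty strings
by_any = t.split()        # splits on runs of whitespace, no empty strings

print(len(by_space), by_space)
print(len(by_any), by_any)
```

For the tweet in LAB 9-2 both calls give the same count because its words are separated by exactly one space; split() is the safer choice when the spacing is not known in advance.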

LAB 9-3 : Converting a Twitter message between uppercase and lowercase

In [42]:
t = "It's Not The Right Time To Conduct Exams. MY DEMAND IN BOLD AND CAPITAL. NO EXAMS IN COVID!!!"
t
Out[42]:
"It's Not The Right Time To Conduct Exams. MY DEMAND IN BOLD AND CAPITAL. NO EXAMS IN COVID!!!"
In [43]:
t = "It's Not The Right Time To Conduct Exams. MY DEMAND IN BOLD AND CAPITAL. NO EXAMS IN COVID!!!"
l = t.lower()
l
Out[43]:
"it's not the right time to conduct exams. my demand in bold and capital. no exams in covid!!!"

Challenge 9.3

  • In tweet data, frequent capital letters or exclamation marks often indicate that the writer is excited or angry. Write code that counts how many capital letters and exclamation marks are used in the given source tweet. Counting the exclamation marks is easy with the count() function. How can the capital letters be counted? The original tweet can be split into individual characters with list() to build a list. Then, for each character ch in that list, calling ch.isupper() returns True when the character is uppercase.
In [44]:
t_lst = list(t)
print('Number of exclamation marks:', t_lst.count('!'))
Number of exclamation marks: 3
In [45]:
count = 0
for ch in t_lst:
    if ch.isupper():
        count += 1

print('Number of uppercase letters:', count)
Number of uppercase letters: 46
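The two counts in Challenge 9.3 can also be computed without building a list first. The following is only a compact alternative sketch, not the book's solution; the names n_exclaim and n_upper are hypothetical.

```python
t = "It's Not The Right Time To Conduct Exams. MY DEMAND IN BOLD AND CAPITAL. NO EXAMS IN COVID!!!"

n_exclaim = t.count('!')                   # str.count works directly on the string
n_upper = sum(ch.isupper() for ch in t)    # each True counts as 1 when summed
print(n_exclaim, n_upper)
```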

LAB 9-4 : Generating a one-time password

In [46]:
import random 

n_digits = int(input('How many digits should the password have? '))

otp = ''
for i in range(n_digits):
    otp += str(random.randrange(0, 10))

print(otp)
How many digits should the password have? 6
064898
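The random module used above is a general-purpose pseudo-random generator and is not intended for security-sensitive values. As a hedged sketch of an alternative, the standard-library secrets module can produce the digits instead. The function name make_otp is hypothetical and does not appear in the original notebook.

```python
import secrets

def make_otp(n_digits):   # hypothetical helper name
    """Return a string of n_digits random decimal digits."""
    return ''.join(str(secrets.randbelow(10)) for _ in range(n_digits))

otp = make_otp(6)
print(otp)
```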
In [47]:
!pip install wordcloud wikipedia
Collecting wordcloud
  Downloading wordcloud-1.8.1-cp37-cp37m-win_amd64.whl (154 kB)
Collecting wikipedia
  Using cached wikipedia-1.4.0.tar.gz (27 kB)
Requirement already satisfied: pillow in c:\users\shjung\anaconda3\lib\site-packages (from wordcloud) (7.0.0)
Requirement already satisfied: matplotlib in c:\users\shjung\anaconda3\lib\site-packages (from wordcloud) (3.1.1)
Requirement already satisfied: numpy>=1.6.1 in c:\users\shjung\anaconda3\lib\site-packages (from wordcloud) (1.18.1)
Requirement already satisfied: beautifulsoup4 in c:\users\shjung\anaconda3\lib\site-packages (from wikipedia) (4.8.2)
Requirement already satisfied: requests<3.0.0,>=2.0.0 in c:\users\shjung\anaconda3\lib\site-packages (from wikipedia) (2.22.0)
Requirement already satisfied: cycler>=0.10 in c:\users\shjung\anaconda3\lib\site-packages (from matplotlib->wordcloud) (0.10.0)
Requirement already satisfied: kiwisolver>=1.0.1 in c:\users\shjung\anaconda3\lib\site-packages (from matplotlib->wordcloud) (1.1.0)
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in c:\users\shjung\anaconda3\lib\site-packages (from matplotlib->wordcloud) (2.4.6)
Requirement already satisfied: python-dateutil>=2.1 in c:\users\shjung\anaconda3\lib\site-packages (from matplotlib->wordcloud) (2.8.1)
Requirement already satisfied: soupsieve>=1.2 in c:\users\shjung\anaconda3\lib\site-packages (from beautifulsoup4->wikipedia) (1.9.5)
Requirement already satisfied: idna<2.9,>=2.5 in c:\users\shjung\anaconda3\lib\site-packages (from requests<3.0.0,>=2.0.0->wikipedia) (2.8)
Requirement already satisfied: chardet<3.1.0,>=3.0.2 in c:\users\shjung\anaconda3\lib\site-packages (from requests<3.0.0,>=2.0.0->wikipedia) (3.0.4)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in c:\users\shjung\anaconda3\lib\site-packages (from requests<3.0.0,>=2.0.0->wikipedia) (1.25.8)
Requirement already satisfied: certifi>=2017.4.17 in c:\users\shjung\anaconda3\lib\site-packages (from requests<3.0.0,>=2.0.0->wikipedia) (2021.10.8)
Requirement already satisfied: six in c:\users\shjung\anaconda3\lib\site-packages (from cycler>=0.10->matplotlib->wordcloud) (1.14.0)
Requirement already satisfied: setuptools in c:\users\shjung\anaconda3\lib\site-packages (from kiwisolver>=1.0.1->matplotlib->wordcloud) (45.1.0.post20200127)
Building wheels for collected packages: wikipedia
  Building wheel for wikipedia (setup.py): started
  Building wheel for wikipedia (setup.py): finished with status 'done'
  Created wheel for wikipedia: filename=wikipedia-1.4.0-py3-none-any.whl size=11691 sha256=bea1c4272505244cf8f800fabf37c1b6de493a4537e09bb37b1c9d208d3fc884
  Stored in directory: c:\users\shjung\appdata\local\pip\cache\wheels\15\93\6d\5b2c68b8a64c7a7a04947b4ed6d89fb557dcc6bc27d1d7f3ba
Successfully built wikipedia
Installing collected packages: wordcloud, wikipedia
Successfully installed wikipedia-1.4.0 wordcloud-1.8.1
In [48]:
import wikipedia

# Specify the title of the Wikipedia page
wiki = wikipedia.page('Artificial intelligence')
# Extract the plain text content of the page
text = wiki.content
In [49]:
from wordcloud import WordCloud

# Generate word cloud
wordcloud = WordCloud(width = 2000, height = 1500).generate(text)
In [50]:
import matplotlib.pyplot as plt
plt.figure(figsize=(40, 30))
# Display image
plt.imshow(wordcloud) 
plt.show()
In [51]:
import wikipedia
import matplotlib.pyplot as plt
from wordcloud import WordCloud, STOPWORDS

# Specify the title of the Wikipedia page
wiki = wikipedia.page('Artificial intelligence')
# Extract the plain text content of the page
text = wiki.content

# Generate word cloud
s_words = STOPWORDS.union( {'one', 'using', 'first', 'two', 'make', 'use'} )
wordcloud = WordCloud(width = 2000, height = 1500, 
                      stopwords = s_words).generate(text)

plt.figure(figsize=(40, 30))
# Display image
plt.imshow(wordcloud) 
plt.show()

Challenge 9.4

  • Now create word clouds for Python and for Machine Learning, one for each.
In [52]:
import wikipedia
import matplotlib.pyplot as plt
from wordcloud import WordCloud, STOPWORDS

# Specify the title of the Wikipedia page
wiki = wikipedia.page('Python')
# Extract the plain text content of the page
text = wiki.content

# Generate word cloud
s_words = STOPWORDS.union( {'one', 'using', 'first', 'two', 'make', 'use'} )
wordcloud = WordCloud(width = 2000, height = 1500, 
                      stopwords = s_words).generate(text)

plt.figure(figsize=(40, 30))
# Display image
plt.imshow(wordcloud) 
plt.show()

9.9 Regular expressions: an elegant way to extract text with rules

In [55]:
import re

txt1 = "Life is too short, you need python."
txt2 = "The best moments of my life."
print(re.search('Life', txt1))    # check whether the sentence contains Life
<re.Match object; span=(0, 4), match='Life'>
In [56]:
print(re.search('Life', txt2))    # check whether the sentence contains Life
None
In [57]:
match = re.search('Life', txt1)
match.group()
Out[57]:
'Life'
In [58]:
match.start()
Out[58]:
0
In [59]:
match.end()
Out[59]:
4
In [60]:
match.span()
Out[60]:
(0, 4)
In [61]:
print(re.search('Life|life', txt2))    # check whether the sentence contains Life or life
<re.Match object; span=(23, 27), match='life'>
In [62]:
print(re.search('[Ll]ife', txt2))    # check whether the sentence contains Life or life
<re.Match object; span=(23, 27), match='life'>
In [63]:
txt1 = "Life is too short, you need python."
txt2 = "The best moments of my life."
txt3 = "My Life My Choice."

print(re.search('^Life', txt1))    # check whether the sentence starts with Life
<re.Match object; span=(0, 4), match='Life'>
In [64]:
print(re.search('^Life', txt2))    # check whether the sentence starts with Life
None
In [65]:
print(re.search('^Life', txt3))    # check whether the sentence starts with Life
None
In [66]:
print(re.search('Life', txt3))    # check whether Life appears anywhere in the sentence
<re.Match object; span=(3, 7), match='Life'>
In [67]:
re.search('My', txt3)
Out[67]:
<re.Match object; span=(0, 2), match='My'>
In [68]:
txt3 = "Life is like a box of chocolates."
txt4 = "My Life My Choice."
In [69]:
print(re.search('^Life', txt1))    # check whether the sentence starts with Life
print(re.search('^Life', txt2))    # check whether the sentence starts with Life
print(re.search('^Life', txt3))    # check whether the sentence starts with Life
print(re.search('^Life', txt4))    # check whether the sentence starts with Life
<re.Match object; span=(0, 4), match='Life'>
None
<re.Match object; span=(0, 4), match='Life'>
None
In [70]:
print(re.search('Life|life', txt1))   # check whether the sentence contains Life or life
print(re.search('Life|life', txt2))   # check whether the sentence contains Life or life
<re.Match object; span=(0, 4), match='Life'>
<re.Match object; span=(23, 27), match='life'>
In [71]:
print(re.search('[Ll]ife', txt1))   # check whether the sentence contains Life or life
print(re.search('[Ll]ife', txt2))   # check whether the sentence contains Life or life
<re.Match object; span=(0, 4), match='Life'>
<re.Match object; span=(23, 27), match='life'>
In [72]:
re.findall('My', txt4)
Out[72]:
['My', 'My']
In [73]:
import re
txt1 = 'Who are you to judge the life I live'
txt2 = 'The best moments of my life'
print(re.search('life$', txt1))   # check whether the sentence ends with life
None
In [74]:
print(re.search('life$', txt2))   # check whether the sentence ends with life
<re.Match object; span=(23, 27), match='life'>

9.10 A closer look at metacharacters

In [75]:
import re

re.search('A..A', 'ABA')      # no match
In [76]:
re.search('A..A', 'ABBA')     # match
Out[76]:
<re.Match object; span=(0, 4), match='ABBA'>
In [77]:
re.search('A..A', 'ABBBA')    # no match

* metacharacter

In [78]:
re.search('AB*', 'A')    # match
Out[78]:
<re.Match object; span=(0, 1), match='A'>
In [79]:
re.search('AB*', 'AA')    # match
Out[79]:
<re.Match object; span=(0, 1), match='A'>
In [80]:
re.search('AB*', 'J-HOP')   # no match
In [81]:
re.search('AB*', 'X-MAN')   # match
Out[81]:
<re.Match object; span=(3, 4), match='A'>
In [82]:
re.search('AB*', 'CABBA')      # match
Out[82]:
<re.Match object; span=(1, 4), match='ABB'>
In [83]:
re.search('AB*', 'CABBBBBA')      # match
Out[83]:
<re.Match object; span=(1, 7), match='ABBBBB'>

? metacharacter

In [84]:
re.search('AB?', 'A')    # match
Out[84]:
<re.Match object; span=(0, 1), match='A'>
In [85]:
re.search('AB?', 'AA')    # match
Out[85]:
<re.Match object; span=(0, 1), match='A'>
In [86]:
re.search('AB?', 'J-HOP')   # no match
In [87]:
re.search('AB?', 'X-MAN')   # match
Out[87]:
<re.Match object; span=(3, 4), match='A'>
In [88]:
re.search('AB?', 'CABBA')      # match
Out[88]:
<re.Match object; span=(1, 3), match='AB'>
In [89]:
re.search('AB?', 'CABBBBBA')      # match
Out[89]:
<re.Match object; span=(1, 3), match='AB'>

+ metacharacter

In [90]:
re.search('AB+', 'A')       # no match
In [91]:
re.search('AB+', 'AA')      # no match
In [92]:
re.search('AB+', 'J-HOP')   # no match
In [93]:
re.search('AB+', 'X-MAN')   # no match
In [94]:
re.search('AB+', 'CABBA')      # the string 'ABB' matches
Out[94]:
<re.Match object; span=(1, 4), match='ABB'>
In [95]:
re.search('AB+', 'CABBBBBA')   # the string 'ABBBBB' matches
Out[95]:
<re.Match object; span=(1, 7), match='ABBBBB'>
In [96]:
txt3 = 'My life my life my life in the sunshine'
re.findall('[Mm]y', txt3)
Out[96]:
['My', 'my', 'my']
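The cells above show *, ? and + one at a time with re.search. As a side-by-side sketch (not from the original notebook), re.fullmatch can be used to see which whole strings each pattern accepts: * allows zero or more B, ? allows zero or one, and + requires at least one. The names candidates and results are hypothetical.

```python
import re

candidates = ['A', 'AB', 'ABB']     # illustrative inputs
results = {}
for pattern in ['AB*', 'AB?', 'AB+']:
    # fullmatch succeeds only when the entire string fits the pattern
    results[pattern] = [c for c in candidates if re.fullmatch(pattern, c)]
    print(pattern, results[pattern])
```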
In [98]:
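import re

# open the file in a with block so that it is closed automatically
with open('./UNDHR.txt') as f:
    for line in f:
        line = line.rstrip()
        # keep only lines that begin with a parenthesized number such as (1)
        if re.search(r'^\([0-9]+\)', line):
            print(line)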
(1) Everyone charged with a penal offence has the right to be presumed innocent until proved guilty according to law in a public trial at which he has had all the guarantees necessary for his defence.
(2) No one shall be held guilty of any penal offence on account of any act or omission which did not constitute a penal offence, under national or international law, at the time when it was committed. Nor shall a heavier penalty be imposed than the one that was applicable at the time the penal offence was committed.
(1) Everyone has the right to freedom of movement and residence within the borders of each state.
(2) Everyone has the right to leave any country, including his own, and to return to his country.
(1) Everyone has the right to seek and to enjoy in other countries asylum from persecution.
(2) This right may not be invoked in the case of prosecutions genuinely arising from non-political crimes or from acts contrary to the purposes and principles of the United Nations.
(1) Everyone has the right to a nationality.
(2) No one shall be arbitrarily deprived of his nationality nor denied the right to change his nationality.
(1) Men and women of full age, without any limitation due to race, nationality or religion, have the right to marry and to found a family. They are entitled to equal rights as to marriage, during marriage and at its dissolution.
(2) Marriage shall be entered into only with the free and full consent of the intending spouses.
(3) The family is the natural and fundamental group unit of society and is entitled to protection by society and the State.
(1) Everyone has the right to own property alone as well as in association with others.
(2) No one shall be arbitrarily deprived of his property.
(1) Everyone has the right to freedom of peaceful assembly and association.
(2) No one may be compelled to belong to an association.
(1) Everyone has the right to take part in the government of his country, directly or through freely chosen representatives.
(2) Everyone has the right of equal access to public service in his country.
(3) The will of the people shall be the basis of the authority of government; this will shall be expressed in periodic and genuine elections which shall be by universal and equal suffrage and shall be held by secret vote or by equivalent free voting procedures.
(1) Everyone has the right to work, to free choice of employment, to just and favourable conditions of work and to protection against unemployment.
(2) Everyone, without any discrimination, has the right to equal pay for equal work.
(3) Everyone who works has the right to just and favourable remuneration ensuring for himself and his family an existence worthy of human dignity, and supplemented, if necessary, by other means of social protection.
(4) Everyone has the right to form and to join trade unions for the protection of his interests.
(1) Everyone has the right to a standard of living adequate for the health and well-being of himself and of his family, including food, clothing, housing and medical care and necessary social services, and the right to security in the event of unemployment, sickness, disability, widowhood, old age or other lack of livelihood in circumstances beyond his control.
(2) Motherhood and childhood are entitled to special care and assistance. All children, whether born in or out of wedlock, shall enjoy the same social protection.
(1) Everyone has the right to education. Education shall be free, at least in the elementary and fundamental stages. Elementary education shall be compulsory. Technical and professional education shall be made generally available and higher education shall be equally accessible to all on the basis of merit.
(2) Education shall be directed to the full development of the human personality and to the strengthening of respect for human rights and fundamental freedoms. It shall promote understanding, tolerance and friendship among all nations, racial or religious groups, and shall further the activities of the United Nations for the maintenance of peace.
(3) Parents have a prior right to choose the kind of education that shall be given to their children.
(1) Everyone has the right freely to participate in the cultural life of the community, to enjoy the arts and to share in scientific advancement and its benefits.
(2) Everyone has the right to the protection of the moral and material interests resulting from any scientific, literary or artistic production of which he is the author.
(1) Everyone has duties to the community in which alone the free and full development of his personality is possible.
(2) In the exercise of his rights and freedoms, everyone shall be subject only to such limitations as are determined by law solely for the purpose of securing due recognition and respect for the rights and freedoms of others and of meeting the just requirements of morality, public order and the general welfare in a democratic society.
(3) These rights and freedoms may in no case be exercised contrary to the purposes and principles of the United Nations.

LAB 9-5 : Extracting course codes

In [99]:
import re

# multi-line text is written with triple quotes
text="""101 COM PythonProgramming
102 MAT LinearAlgebra
103 ENG ComputerEnglish"""

s = re.findall(r'\d+', text)
print(s)
['101', '102', '103']
In [100]:
import re 
 
txt = "Emails from abc@facebook.com and bbc@google.com have arrived."
output = re.findall(r'\S+@[a-z.]+', txt)
print('Extracted emails:', output)
Extracted emails: ['abc@facebook.com', 'bbc@google.com']
In [101]:
import re 

while True:
    password = input("Enter a password: ")
    # Note: "([a-z])([A-Z])" matches only a lowercase letter immediately followed by an uppercase letter
    if len(password)<8 or not re.search("([a-z])([A-Z])", password) \
        or not re.search("[0-9]", password) \
        or not re.search("[_@$]", password):
        pass    # invalid password: ask again without printing a message
    else:
        print("Valid password")
        break
Enter a password: IamHappy@$
Enter a password: IamHappy80@$
Valid password
In [102]:
import re 

while True:
    password = input("Enter a password: ")
    if len(password)<8 or not re.search("[a-z]", password) or \
        not re.search("[A-Z]", password) or \
        not re.search("[0-9]", password) or not re.search("[_@$!]", password):
        print("Invalid password!")
    else:
        print("Valid password")
        break
Enter a password: IamHappy
Invalid password!
Enter a password: IamHappy@
Invalid password!
Enter a password: IamHappy88_
Valid password
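Because the loop in cell In [102] reads from input(), it is awkward to try many passwords at once. A minimal sketch, assuming the same five rules, wraps the checks in a function so they can be applied to a list of candidates. The name is_valid_password is hypothetical and not part of the original notebook.

```python
import re

def is_valid_password(password):   # hypothetical helper name
    """Apply the same five checks as cell In [102], without input()."""
    return (len(password) >= 8
            and re.search('[a-z]', password) is not None     # a lowercase letter
            and re.search('[A-Z]', password) is not None     # an uppercase letter
            and re.search('[0-9]', password) is not None     # a digit
            and re.search('[_@$!]', password) is not None)   # one of _ @ $ !

for candidate in ['IamHappy', 'IamHappy@', 'IamHappy88_']:
    print(candidate, is_valid_password(candidate))
```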

9.11 sub(): the function that replaces specific text in regular expressions

In [103]:
import re

s = 'I like BTS!'
re.sub('BTS', 'Black Pink', s)
Out[103]:
'I like Black Pink!'
In [104]:
s = 'My lucky number 2 7 99'
re.sub('[0-9]+', '*', s)     # find only the digits and replace them with *
Out[104]:
'My lucky number * * *'
In [105]:
re.sub(r'\d+', '*', s)     # find only the digits and replace them with *
Out[105]:
'My lucky number * * *'
In [106]:
def hash_by_mult_and_modulo(m):        # receives a match object as its parameter
    n = int(m.group())    # get the matched string and convert it to an integer
    return str(n * 23435 % 973)    # multiply by 23435, take the remainder modulo 973, and return it as a string

print(re.sub('[0-9]+', hash_by_mult_and_modulo, s))
My lucky number 166 581 433

LAB 9-8 : Cleaning up tweet messages

In [107]:
import re 
tweet = input('Enter a tweet: ')
tweet = re.sub('RT', '', tweet)
tweet = re.sub(r'#\S+', '', tweet)
tweet = re.sub(r'@\S+', '', tweet)
print(tweet)
Enter a tweet: I am Happy
I am Happy
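The sample input in LAB 9-8 contains nothing to remove, so the effect of the three substitutions is not visible. The sketch below applies them to a made-up tweet. It is not the book's code: it wraps the steps in a hypothetical function clean_tweet, uses the word-boundary pattern \bRT\b so that the letters RT inside a word such as SMART are left alone (the plain 'RT' pattern would turn SMART into SMA), and collapses the leftover whitespace.

```python
import re

def clean_tweet(tweet):   # hypothetical helper name
    tweet = re.sub(r'\bRT\b', '', tweet)   # drop the retweet marker only as a whole word
    tweet = re.sub(r'#\S+', '', tweet)     # drop hashtags
    tweet = re.sub(r'@\S+', '', tweet)     # drop mentions
    return ' '.join(tweet.split())         # collapse leftover whitespace

sample = 'RT @user: SMART people love #Python and #regex'   # made-up sample tweet
cleaned = clean_tweet(sample)
print(cleaned)
```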