https://www.youtube.com/watch?v=efWlOCE_6HY&list=PL1w8k37X_6L_s4ncq-swTBvKDWnRSrinI

RNN W1L01 : Why Sequence Models

This lecture shows example applications of RNNs.

image


https://www.youtube.com/watch?v=XeQN82D4bCQ&index=2&list=PL1w8k37X_6L_s4ncq-swTBvKDWnRSrinI

RNN W1L02 : Notation

image

(i) is the index of the training example. <t> is the position of a word within one example. T_x^(i) is the total number of words in example i.

image

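Vocabularies of 30,000 to 50,000 words are typical for commercial applications, and vocabularies of up to 1 million words are sometimes used. The number next to each vector is that word's index in the vocabulary list.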
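As a rough sketch of the representation described above, each word can be encoded as a one-hot vector that is zero everywhere except at the word's vocabulary index. The toy vocabulary and the helper name `oneHot` are made up for illustration; a real vocabulary would have tens of thousands of entries.

```swift
// Hypothetical toy vocabulary; real vocabularies have 30,000 to 50,000+ words.
let toyVocabulary = ["a", "and", "harry", "potter", "zulu"]

// Returns a vector of zeros with a single 1 at the word's vocabulary index,
// or nil when the word is not in the vocabulary.
func oneHot(_ word: String, in vocabulary: [String]) -> [Double]? {
    guard let index = vocabulary.firstIndex(of: word) else { return nil }
    var vector = [Double](repeating: 0, count: vocabulary.count)
    vector[index] = 1
    return vector
}

let harry = oneHot("harry", in: toyVocabulary)
```

Here `harry` holds a 1 at index 2 because "harry" is the third entry of the toy vocabulary.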


https://www.youtube.com/watch?v=2E65LDnM2cA&list=PL1w8k37X_6L_s4ncq-swTBvKDWnRSrinI&index=3

RNN W1L03 : RNN Model

image

The figure above explains why a standard neural network cannot be used for this problem.

image

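The right and left diagrams show the same network; the left one is simply unrolled step by step. Waa is the weight that multiplies the previous step's activation a for use in the current step's computation. It is applied at a<1>, a<2>, a<3>, ... and has the same value at every step. Wax is the weight applied to the input; it is used at every step and always has the same value. Wya is the weight used to compute the output of each step; it too is shared across all steps. This lecture assumes that Tx equals Ty. One drawback of a plain RNN is that information from earlier steps can be used at the current step, but information that arrives later cannot. A bidirectional RNN is used to address this.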

image

tanh or ReLU is commonly used as the activation function g for a. For y, sigmoid is commonly used (softmax when the output has more than two classes).

image

Waa and Wax are combined into a single matrix Wa. The slide shows the overall vectorized form.
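Written out, the forward step described in these notes looks roughly as follows. The bias terms b_a and b_y and the stacking notation are the usual convention for this model rather than something spelled out in the notes, so treat them as assumptions.

```latex
\begin{aligned}
a^{\langle t \rangle} &= g\left(W_{aa}\, a^{\langle t-1 \rangle} + W_{ax}\, x^{\langle t \rangle} + b_a\right) \\
\hat{y}^{\langle t \rangle} &= g'\left(W_{ya}\, a^{\langle t \rangle} + b_y\right) \\
W_a &= \left[\, W_{aa} \;\; W_{ax} \,\right], \qquad
a^{\langle t \rangle} = g\left(W_a \left[\, a^{\langle t-1 \rangle} ;\; x^{\langle t \rangle} \,\right] + b_a\right)
\end{aligned}
```

Stacking the two weight matrices side by side and the two vectors on top of each other gives the same product as the first line, which is why the compact form is equivalent.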


https://www.youtube.com/watch?v=esgbmJ6SnSY&list=PL1w8k37X_6L_s4ncq-swTBvKDWnRSrinI&index=4

RNN W1L04 : Backpropagation through time

image

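The red lines show the backprop pass. The most important part is the one circled in red. The loss function is the same as in logistic regression. The losses obtained by comparing each time step's y hat with the true value are summed to give the final cost. This process is called backpropagation through time.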
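As a sketch of the loss described above, assuming a binary label at each time step (the logistic-regression case), the per-step loss and the total cost can be written as:

```latex
\begin{aligned}
\mathcal{L}^{\langle t \rangle}\left(\hat{y}^{\langle t \rangle}, y^{\langle t \rangle}\right)
  &= -\,y^{\langle t \rangle} \log \hat{y}^{\langle t \rangle}
     - \left(1 - y^{\langle t \rangle}\right) \log\left(1 - \hat{y}^{\langle t \rangle}\right) \\
\mathcal{L}\left(\hat{y}, y\right)
  &= \sum_{t=1}^{T_y} \mathcal{L}^{\langle t \rangle}\left(\hat{y}^{\langle t \rangle}, y^{\langle t \rangle}\right)
\end{aligned}
```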


https://www.youtube.com/watch?v=G5kW3V6qHuk&index=5&list=PL1w8k37X_6L_s4ncq-swTBvKDWnRSrinI

RNN W1L05 : Different types of RNNs

image
image
image

In the many-to-many case, Tx and Ty can be equal or different.


https://www.youtube.com/watch?v=1rOCxV0fSyM&list=PL1w8k37X_6L_s4ncq-swTBvKDWnRSrinI&index=6

RNN W1L06 : Language Model and sequence generation

image

In this lecture, P() gives the probability that the sentence inside the parentheses would actually occur.

image

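Each word is mapped to its own y<t>. This is called tokenizing. The end of a sentence is represented by an <EOS> token. Punctuation marks are sometimes treated as tokens as well. Words that are not in the vocabulary are replaced with an <UNK> token.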
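The tokenizing step above can be sketched as follows. This is a minimal illustration, assuming whitespace-separated words and a tiny made-up vocabulary; the names `knownWords` and `tokenize` are hypothetical, and punctuation handling is left out.

```swift
// Hypothetical toy vocabulary of lowercased words.
let knownWords: Set<String> = ["cats", "average", "15", "hours", "of", "sleep", "a", "day"]

// Splits a sentence on spaces, replaces out-of-vocabulary words with <UNK>,
// and appends <EOS> to mark the end of the sentence.
func tokenize(_ sentence: String, vocabulary: Set<String>) -> [String] {
    let words = sentence.lowercased().split(separator: " ").map { String($0) }
    let tokens = words.map { vocabulary.contains($0) ? $0 : "<UNK>" }
    return tokens + ["<EOS>"]
}

let tokens = tokenize("Cats average 15 hours of sleep a day", vocabulary: knownWords)
let withUnknown = tokenize("Cats purr a day", vocabulary: knownWords)
```

In the second call "purr" is not in the toy vocabulary, so it comes back as `<UNK>`.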

image

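The lower right of the slide shows that the probability of the sentence is the product of the per-word probabilities. From the second word onward, each factor is a conditional probability.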
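For a three-token sentence, the product described above is the chain rule of probability:

```latex
P\left(y^{\langle 1 \rangle}, y^{\langle 2 \rangle}, y^{\langle 3 \rangle}\right)
  = P\left(y^{\langle 1 \rangle}\right)\,
    P\left(y^{\langle 2 \rangle} \mid y^{\langle 1 \rangle}\right)\,
    P\left(y^{\langle 3 \rangle} \mid y^{\langle 1 \rangle}, y^{\langle 2 \rangle}\right)
```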


https://www.youtube.com/watch?v=CKrxdgqBheY&list=PL1w8k37X_6L_s4ncq-swTBvKDWnRSrinI&index=7

RNN W1L07 : Sampling novel Sequences

image
image

A character-level model requires more computational resources, and the influence of earlier characters is not carried well to later parts of the sequence.


https://www.youtube.com/watch?v=3Hn_hEPtciQ&list=PL1w8k37X_6L_s4ncq-swTBvKDWnRSrinI&index=8

RNN W1L08 : Vanishing gradients with RNNs

image

In a very deep network, and likewise in an RNN, the early part of the input has trouble influencing the later part.

RNNs and networks with many layers are vulnerable to vanishing and exploding gradients. Exploding gradients are easy to notice because they often show up as NaN values in the results, and they can be handled easily with gradient clipping. Vanishing gradients, however, are harder to detect. They are addressed by using a gated recurrent unit (GRU) or an LSTM.
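A minimal sketch of gradient clipping, assuming the element-wise variant in which every component is limited to a fixed range. The function name and the threshold are illustrative; clipping by the overall gradient norm is another common choice.

```swift
// Clips every gradient component into the range [-maxValue, maxValue].
// The threshold is a hypothetical hyperparameter chosen for illustration.
func clipGradients(_ gradients: [Double], maxValue: Double) -> [Double] {
    return gradients.map { min(max($0, -maxValue), maxValue) }
}

let clipped = clipGradients([0.5, -12.0, 37.5, -0.1], maxValue: 5.0)
```

Components already inside the range pass through unchanged; only the two large values are pulled back to the threshold.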


https://www.youtube.com/watch?v=xSCy3q2ts44&list=PL1w8k37X_6L_s4ncq-swTBvKDWnRSrinI&index=9

RNN W1L09 : Gated Recurrent Unit GRU

image

The figure above shows a standard RNN unit.

image

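The purple box in the upper-left diagram and the purple bracket in the formula on the right represent the same thing. In the purple part, when Γu is 0 the previous value c<t-1> is kept as it is, and when Γu is 1 the new candidate c̃<t> is stored. Γu therefore acts as a switch. c<t>, c̃<t>, and Γu are all vectors of the same size, and the * operation inside the purple box is element-wise.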
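The switching behaviour of the update gate can be sketched directly. This is only an illustration of the element-wise update rule with hand-picked gate values; in a real GRU the gate comes out of a sigmoid, and the function name `updateMemory` is hypothetical.

```swift
// Element-wise GRU memory update: c<t> = Γu * c̃<t> + (1 - Γu) * c<t-1>.
func updateMemory(gate: [Double], candidate: [Double], previous: [Double]) -> [Double] {
    precondition(gate.count == candidate.count && gate.count == previous.count,
                 "all three vectors must have the same length")
    return gate.indices.map { i in
        gate[i] * candidate[i] + (1 - gate[i]) * previous[i]
    }
}

// Gate 0 keeps the old value, gate 1 takes the candidate, 0.5 blends both.
let next = updateMemory(gate: [0, 1, 0.5], candidate: [10, 20, 30], previous: [1, 2, 3])
```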

image

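This is an extension of the simplified GRU above and is the version used more often. Γr is the relevance gate. LSTM is another algorithm that plays a similar role. When the full GRU is described elsewhere, the notation shown to the left of the formulas is sometimes used instead.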
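For reference, the full GRU is usually written as below. The weight and bias names (W_c, W_u, W_r, b_c, b_u, b_r) follow the common convention and are assumptions here, since the notes only describe the slide.

```latex
\begin{aligned}
\tilde{c}^{\langle t \rangle} &= \tanh\left(W_c \left[\, \Gamma_r * c^{\langle t-1 \rangle},\; x^{\langle t \rangle} \,\right] + b_c\right) \\
\Gamma_u &= \sigma\left(W_u \left[\, c^{\langle t-1 \rangle},\; x^{\langle t \rangle} \,\right] + b_u\right) \\
\Gamma_r &= \sigma\left(W_r \left[\, c^{\langle t-1 \rangle},\; x^{\langle t \rangle} \,\right] + b_r\right) \\
c^{\langle t \rangle} &= \Gamma_u * \tilde{c}^{\langle t \rangle} + \left(1 - \Gamma_u\right) * c^{\langle t-1 \rangle} \\
a^{\langle t \rangle} &= c^{\langle t \rangle}
\end{aligned}
```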


https://www.youtube.com/watch?v=5wh4HWWfZIY&list=PL1w8k37X_6L_s4ncq-swTBvKDWnRSrinI&index=10

RNN W1L10 : Long Short Term Memory (LSTM)

image
image

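Some variants also feed c<t-1> into the gate computations; this is called a peephole connection. LSTM was developed before GRU. Because LSTM has more gates than GRU, it costs somewhat more to compute, but it is more powerful and flexible. GRU is simpler, which makes it better suited to building large networks. Overall, LSTM is used more often.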
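For comparison with the GRU, the LSTM without peephole connections is usually written as below. The symbol names follow the common convention (update, forget, and output gates) and are assumptions here rather than a transcription of the slides.

```latex
\begin{aligned}
\tilde{c}^{\langle t \rangle} &= \tanh\left(W_c \left[\, a^{\langle t-1 \rangle},\; x^{\langle t \rangle} \,\right] + b_c\right) \\
\Gamma_u &= \sigma\left(W_u \left[\, a^{\langle t-1 \rangle},\; x^{\langle t \rangle} \,\right] + b_u\right) \\
\Gamma_f &= \sigma\left(W_f \left[\, a^{\langle t-1 \rangle},\; x^{\langle t \rangle} \,\right] + b_f\right) \\
\Gamma_o &= \sigma\left(W_o \left[\, a^{\langle t-1 \rangle},\; x^{\langle t \rangle} \,\right] + b_o\right) \\
c^{\langle t \rangle} &= \Gamma_u * \tilde{c}^{\langle t \rangle} + \Gamma_f * c^{\langle t-1 \rangle} \\
a^{\langle t \rangle} &= \Gamma_o * \tanh\left(c^{\langle t \rangle}\right)
\end{aligned}
```

The separate forget gate replaces the GRU's (1 - Γu) term, which is where the extra flexibility and the extra cost come from.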


https://www.youtube.com/watch?v=bTXGpATdKRY&list=PL1w8k37X_6L_s4ncq-swTBvKDWnRSrinI&index=11

RNN W1L11 : Bidirectional RNN

In a basic RNN, data from earlier steps can be used at the current step, but information from later steps cannot. A bidirectional RNN compensates for this, and it can be built from plain RNN, LSTM, or GRU units.

image
image

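The network is an acyclic graph; it contains no cycles at all. When ŷ<3> is computed, the computation follows the yellow arrows: x<1>, a→<1>, x<2>, a→<2> arrive from the forward direction, x<4> and a←<4> arrive from the backward direction, and the result is computed with the formula at the top of the slide.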
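The formula referred to above is usually written as follows, combining the forward and backward activations at the same time step. The names W_y and b_y are the conventional ones and are assumptions here.

```latex
\hat{y}^{\langle t \rangle}
  = g\left(W_y \left[\, \overrightarrow{a}^{\langle t \rangle},\; \overleftarrow{a}^{\langle t \rangle} \,\right] + b_y\right)
```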


https://www.youtube.com/watch?v=U7wN1x8zsG8&index=12&list=PL1w8k37X_6L_s4ncq-swTBvKDWnRSrinI

RNN W1L12 : Deep RNNs

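Compared with the basic RNN, the network now has several layers. Each layer is connected both vertically (to the layers above and below) and horizontally (through time).

Usually no more than 3 recurrent layers (layers that are also connected horizontally) are stacked. Sometimes the lower part is built from layers connected both horizontally and vertically, with a network on top whose layers are stacked only vertically.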

https://www.coursera.org/learn/machine-learning/lecture/iDBMm/problem-description-and-pipeline

PHOTO OCR – Problem Description and Pipeline

image

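This lecture covers the photo OCR (optical character recognition) problem.

It shows the steps used to recognize text in a photo. A further step is needed to resolve confusion between similar characters, but it is omitted here (for example, a correction step for cases where 1 and l are confused).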

image

Breaking one large task into a sequence of smaller steps is called a machine learning pipeline. Typically 1 to 5 engineers work on each step.


https://www.coursera.org/learn/machine-learning/lecture/bQhq3/sliding-windows

PHOTO OCR – Sliding Windows

image
image
image

As the figure shows, a green box is moved left to right and top to bottom to search for the object. The box size is varied so that objects of different sizes can be found.
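The scanning described above can be sketched as a function that enumerates window positions. This is a minimal sketch, assuming square windows and a fixed step size; the `Window` type and the function name are hypothetical, and the classifier that would be run on each window is left out.

```swift
// A window position inside the image, in pixels (hypothetical helper type).
struct Window: Equatable {
    let x: Int
    let y: Int
    let size: Int
}

// Enumerates every square window of the given size, moving `step` pixels
// at a time from left to right and top to bottom.
func slidingWindows(imageWidth: Int, imageHeight: Int, size: Int, step: Int) -> [Window] {
    precondition(size > 0 && step > 0, "size and step must be positive")
    guard size <= imageWidth, size <= imageHeight else { return [] }
    var windows: [Window] = []
    for y in stride(from: 0, through: imageHeight - size, by: step) {
        for x in stride(from: 0, through: imageWidth - size, by: step) {
            windows.append(Window(x: x, y: y, size: size))
        }
    }
    return windows
}

// Scan a small 8x6 image with two window sizes to catch objects of different sizes.
let small = slidingWindows(imageWidth: 8, imageHeight: 6, size: 4, step: 2)
let large = slidingWindows(imageWidth: 8, imageHeight: 6, size: 6, step: 2)
```

A smaller step finds objects more reliably but produces more windows to classify, so the step size trades accuracy against cost.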

image
image
image

The lower-right image is the result of slightly expanding the white regions in the lower-left image so that the text areas become wider.

image

Individual characters are separated by finding the gaps between them.


https://www.coursera.org/learn/machine-learning/lecture/K0XQT/getting-lots-of-data-and-artificial-data

Getting Lots of Data and Artificial Data

This lecture explains how to judge whether more data is actually needed, and whether it is better to collect additional data by hand or to generate artificial data.

image

For OCR, artificial data can be generated by rendering text in a variety of fonts.

image

For OCR, artificial data can also be generated by distorting existing images.

image

For speech recognition, additional audio layers (such as background noise) can be mixed into the original recordings.

image
image

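Adding data does little for a high-bias algorithm, so first check that the algorithm does not suffer from high bias. Then decide whether to use artificial data, collect data by hand, or rely on an outside data source.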


https://www.coursera.org/learn/machine-learning/lecture/LrJbq/ceiling-analysis-what-part-of-the-pipeline-to-work-on-next

Ceiling Analysis: What Part of the Pipeline to Work on Next

When working on a machine learning pipeline, it pays to find out which component is performing worst and to focus effort there. This lecture explains how to determine which part of the pipeline is the problem.

image

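Make each stage perfect in turn and check how the overall performance changes. For example, feed in data whose text detection is 100% correct and measure the system, then feed in data whose character segmentation is also 100% correct and measure again.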
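The bookkeeping for ceiling analysis can be sketched as below. The accuracy figures are illustrative placeholders rather than measurements, and the `StageResult` type and `ceilingGains` function are hypothetical names.

```swift
// Overall accuracy (in percent) measured after this stage, and all stages
// before it, have been replaced by perfect hand-labeled output.
struct StageResult {
    let name: String
    let accuracy: Double
}

// The gain attributed to a stage is the jump in overall accuracy
// compared with the previous row.
func ceilingGains(_ results: [StageResult]) -> [(name: String, gain: Double)] {
    guard results.count > 1 else { return [] }
    return (1..<results.count).map { i in
        (name: results[i].name, gain: results[i].accuracy - results[i - 1].accuracy)
    }
}

// Illustrative numbers only.
let results = [
    StageResult(name: "Overall system", accuracy: 72),
    StageResult(name: "Text detection", accuracy: 89),
    StageResult(name: "Character segmentation", accuracy: 90),
    StageResult(name: "Character recognition", accuracy: 100),
]
let gains = ceilingGains(results)
```

With these placeholder numbers, perfect text detection would buy 17 points while perfect character segmentation would buy only 1, which suggests where effort is best spent.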


https://www.coursera.org/learn/machine-learning/lecture/eYaD4/summary-and-thank-you

Summary


String concatenation is as simple as combining two strings with the + operator, and string mutability is managed by choosing between a constant or a variable. Every string is composed of encoding-independent Unicode characters.

NOTE

Swift’s String type is bridged with Foundation’s NSString class. Foundation also extends String to expose methods defined by NSString. This means, if you import Foundation, you can access those NSString methods on String without casting.

String Literals

Use a string literal as an initial value for a constant or variable:

let someString = "Some string literal value"

Multiline String Literals

let quotation = """
The White Rabbit put on his spectacles.  "Where shall I begin,
please your Majesty?" he asked.

"Begin at the beginning," the King said gravely, "and go on
till you come to the end; then stop."
"""

When your source code includes a line break inside of a multiline string literal, that line break also appears in the string’s value. If you want to use line breaks to make your source code easier to read, but you don’t want the line breaks to be part of the string’s value, write a backslash (\) at the end of those lines:

let softWrappedQuotation = """
The White Rabbit put on his spectacles.  "Where shall I begin, \
please your Majesty?" he asked.

"Begin at the beginning," the King said gravely, "and go on \
till you come to the end; then stop."
"""

To make a multiline string literal that begins or ends with a line feed, write a blank line as the first or last line. For example:

let lineBreaks = """

This string starts with a line break.
It also ends with a line break.

"""

A multiline string can be indented to match the surrounding code. The whitespace before the closing quotation marks (""") tells Swift what whitespace to ignore before all of the other lines. However, if you write whitespace at the beginning of a line in addition to what’s before the closing quotation marks, that whitespace is included.
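A short example of the indentation rule described above. The closing delimiter is indented by four spaces, so four spaces are stripped from every line, and the extra four spaces on the second line are kept.

```swift
let linesWithIndentation = """
    This line doesn't begin with whitespace.
        This line begins with four spaces.
    This line doesn't begin with whitespace.
    """
```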

image

Special Characters in String Literals

String literals can include the following special characters:

  • The escaped special characters \0 (null character), \\ (backslash), \t (horizontal tab), \n (line feed), \r (carriage return), \" (double quotation mark) and \' (single quotation mark)
  • An arbitrary Unicode scalar, written as \u{n}, where n is a 1–8 digit hexadecimal number with a value equal to a valid Unicode code point

Because multiline string literals use three double quotation marks instead of just one, you can include a double quotation mark (") inside of a multiline string literal without escaping it.
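A few literals that exercise the special characters listed above. The constant names are illustrative.

```swift
// Escaped double quotation marks inside a single-line literal.
let wiseWords = "\"Imagination is more important than knowledge\" - Einstein"

// Unicode scalars written with the \u{n} syntax.
let dollarSign = "\u{24}"        // $,  Unicode scalar U+0024
let blackHeart = "\u{2665}"      // ♥,  Unicode scalar U+2665
let sparklingHeart = "\u{1F496}" // 💖, Unicode scalar U+1F496

// Inside a multiline literal, a single " needs no escaping, but the
// sequence """ must have at least one of its quotation marks escaped.
let threeDoubleQuotationMarks = """
Escaping the first quotation mark \"""
Escaping all three quotation marks \"\"\"
"""
```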

Initializing an Empty String

var emptyString = ""               // empty string literal
var anotherEmptyString = String()  // initializer syntax
// these two strings are both empty, and are equivalent to each other

Find out whether a String value is empty by checking its Boolean isEmpty property:

if emptyString.isEmpty {
    print("Nothing to see here")
}
// Prints "Nothing to see here"

String Mutability

var variableString = "Horse"
variableString += " and carriage"
// variableString is now "Horse and carriage"

let constantString = "Highlander"
constantString += " and another Highlander"
// this reports a compile-time error - a constant string cannot be modified

Strings Are Value Types

Swift’s String type is a value type. If you create a new String value, that String value is copied when it’s passed to a function or method, or when it’s assigned to a constant or variable. In each case, a new copy of the existing String value is created, and the new copy is passed or assigned, not the original version. (copy by value, not reference)
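A small demonstration of the copy-by-value behaviour described above. The names `original`, `copy`, and `shout` are illustrative.

```swift
let original = "Horse"
var copy = original      // assignment copies the value
copy += " and carriage"  // mutating the copy leaves the original untouched

// The function receives its own copy of the string it is given.
func shout(_ text: String) -> String {
    var local = text     // parameters are constants, so work on a local copy
    local += "!"
    return local
}

let shouted = shout(original)
```

After this runs, `original` is still "Horse" even though both `copy` and `shouted` were derived from it.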

Working with Characters

for character in "Dog!🐶" {
    print(character)
}
// D
// o
// g
// !
// 🐶

let exclamationMark: Character = "!"

How to build a String from an array of Character values:

let catCharacters: [Character] = ["C", "a", "t", "!", "🐱"]
let catString = String(catCharacters)
print(catString)
// Prints "Cat!🐱"

Concatenating Strings and Characters

let string1 = "hello"
let string2 = " there"
var welcome = string1 + string2
// welcome now equals "hello there"

var instruction = "look over"
instruction += string2
// instruction now equals "look over there"

let exclamationMark: Character = "!"
welcome.append(exclamationMark)
// welcome now equals "hello there!"

NOTE

You can’t append a String or Character to an existing Character variable, because a Character value must contain a single character only.

String Interpolation

Each item that you insert into the string literal is wrapped in a pair of parentheses, prefixed by a backslash (\):

let multiplier = 3
let message = "\(multiplier) times 2.5 is \(Double(multiplier) * 2.5)"
// message is "3 times 2.5 is 7.5"

Unicode

Swift’s String and Character types are fully Unicode-compliant.

Unicode Scalars

NOTE

A Unicode scalar is any Unicode code point in the range U+0000 to U+D7FF inclusive or U+E000 to U+10FFFF inclusive. Unicode scalars don’t include the Unicode surrogate pair code points, which are the code points in the range U+D800 to U+DFFF inclusive.

Accessing and Modifying a String

You access and modify a string through its methods and properties, or by using subscript syntax.

String Indices

Each String value has an associated index type, String.Index, which corresponds to the position of each Character in the string.

As mentioned above, different characters can require different amounts of memory to store, so in order to determine which Character is at a particular position, you must iterate over each Unicode scalar from the start or end of that String. For this reason, Swift strings can’t be indexed by integer values.

Use the startIndex property to access the position of the first Character of a String. The endIndex property is the position after the last character in a String. As a result, the endIndex property isn’t a valid argument to a string’s subscript. If a String is empty, startIndex and endIndex are equal.

You access the indices before and after a given index using the index(before:) and index(after:) methods of String. To access an index farther away from the given index, you can use the index(_:offsetBy:) method instead of calling one of these methods multiple times.

let greeting = "Guten Tag!"
greeting[greeting.startIndex]
// G
greeting[greeting.index(before: greeting.endIndex)]
// !
greeting[greeting.index(after: greeting.startIndex)]
// u
let index = greeting.index(greeting.startIndex, offsetBy: 7)
greeting[index]
// a

greeting[greeting.endIndex] // Error
greeting.index(after: greeting.endIndex) // Error
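Because subscripting with an out-of-range index is a runtime error, a bounds-checked lookup by integer offset can be handy. The helper below is a hypothetical sketch, not part of the standard library; it relies on the standard `index(_:offsetBy:limitedBy:)` method and still walks the string, so it takes linear time.

```swift
// Hypothetical helper: returns the Character at an integer offset, or nil
// when the offset is out of bounds. Walking the string makes this O(n).
func character(at offset: Int, in text: String) -> Character? {
    guard offset >= 0,
          let index = text.index(text.startIndex, offsetBy: offset, limitedBy: text.endIndex),
          index < text.endIndex else { return nil }
    return text[index]
}

let seventh = character(at: 7, in: "Guten Tag!")
let pastEnd = character(at: 10, in: "Guten Tag!")
```

The extra `index < text.endIndex` check is needed because `endIndex` itself is a valid result of the offset calculation but not a valid subscript argument.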

Use the indices property to access all of the indices of individual characters in a string.

for index in greeting.indices {
    print("\(greeting[index]) ", terminator: "")
}
// Prints "G u t e n   T a g ! "

NOTE

You can use the startIndex and endIndex properties and the index(before:), index(after:), and index(_:offsetBy:) methods on any type that conforms to the Collection protocol. This includes String, as shown here, as well as collection types such as Array, Dictionary, and Set.

(In other words, String, Array, Dictionary, and Set can all be traversed in the same way.)

Inserting and Removing

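To insert a single character into a string at a specified index, use the insert(_:at:) method, and to insert the contents of another string at a specified index, use the insert(contentsOf:at:) method.

var welcome = "hello"
welcome.insert("!", at: welcome.endIndex)
// welcome now equals "hello!"
welcome.insert(contentsOf: " there", at: welcome.index(before: welcome.endIndex))
// welcome now equals "hello there!"

To remove a single character from a string at a specified index, use the remove(at:) method, and to remove a substring at a specified range, use the removeSubrange(_:) method:

welcome.remove(at: welcome.index(before: welcome.endIndex))
// welcome now equals "hello there"

let range = welcome.index(welcome.endIndex, offsetBy: -6)..<welcome.endIndex
welcome.removeSubrange(range)
// welcome now equals "hello"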

NOTE

You can use the insert(_:at:), insert(contentsOf:at:), remove(at:), and removeSubrange(_:) methods on any type that conforms to the RangeReplaceableCollection protocol. This includes String, as shown here, as well as collection types such as Array. Dictionary and Set do not conform to RangeReplaceableCollection.

(In other words, String and Array can be modified with the same methods.)

Substrings

When you get a substring from a string (for example, using a subscript or a method like prefix(_:)), the result is an instance of Substring, not another string. Substrings in Swift have most of the same methods as strings, which means you can work with substrings the same way you work with strings. However, unlike strings, you use substrings for only a short amount of time while performing actions on a string. When you’re ready to store the result for a longer time, you convert the substring to an instance of String.

let greeting = "Hello, world!"
// firstIndex(of:) replaces the deprecated index(of:)
let index = greeting.firstIndex(of: ",") ?? greeting.endIndex
let beginning = greeting[..<index]
// beginning is "Hello"
// Convert the result to a String for long-term storage.
let newString = String(beginning)

NOTE

Both String and Substring conform to the StringProtocol protocol, which means it’s often convenient for string-manipulation functions to accept a StringProtocol value. You can call such functions with either a String or Substring value.
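A minimal sketch of a function that accepts any StringProtocol value, as the note above suggests. The function `wordCount` and the sample sentence are made up for illustration.

```swift
// Works with both String and Substring because both conform to StringProtocol.
func wordCount<S: StringProtocol>(_ text: S) -> Int {
    return text.split(separator: " ").count
}

let sentence = "Hello, wonderful world!"
let firstPart = sentence.prefix(16)   // Substring: "Hello, wonderful"

let fullCount = wordCount(sentence)
let partCount = wordCount(firstPart)
```

Accepting the protocol avoids converting a Substring to a String just to call the function.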

Comparing Strings

Swift provides three ways to compare textual values: string and character equality, prefix equality, and suffix equality.

String and Character Equality

String and character equality is checked with the “equal to” operator (==) and the “not equal to” operator (!=), as described in Comparison Operators:

let quotation = "We're a lot alike, you and I."
let sameQuotation = "We're a lot alike, you and I."
if quotation == sameQuotation {
    print("These two strings are considered equal")
}
// Prints "These two strings are considered equal"

Two String values (or two Character values) are considered equal if their extended grapheme clusters are canonically equivalent. Extended grapheme clusters are canonically equivalent if they have the same linguistic meaning and appearance, even if they’re composed from different Unicode scalars behind the scenes.
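Canonical equivalence can be checked with two spellings of "é": one precomposed scalar versus a plain "e" followed by a combining accent. The constant names are illustrative.

```swift
// "Voulez-vous un café?" using LATIN SMALL LETTER E WITH ACUTE (U+00E9)
let eAcuteQuestion = "Voulez-vous un caf\u{E9}?"

// The same text using LATIN SMALL LETTER E (U+0065)
// followed by COMBINING ACUTE ACCENT (U+0301)
let combinedEAcuteQuestion = "Voulez-vous un caf\u{65}\u{301}?"

let areEqual = eAcuteQuestion == combinedEAcuteQuestion

// Characters that merely look alike are not canonically equivalent.
let latinCapitalLetterA: Character = "\u{41}"
let cyrillicCapitalLetterA: Character = "\u{0410}"
let lookAlikesEqual = latinCapitalLetterA == cyrillicCapitalLetterA
```

The two questions compare equal and have the same character count, even though the second one contains one more Unicode scalar.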

Prefix and Suffix Equality

To check whether a string has a particular string prefix or suffix, call the string’s hasPrefix(_:) and hasSuffix(_:) methods, both of which take a single argument of type String and return a Boolean value.

let romeoAndJuliet = [
   "Act 1 Scene 1: Verona, A public place",
   "Act 1 Scene 2: Capulet’s mansion",
   "Act 1 Scene 3: A room in Capulet’s mansion",
   "Act 1 Scene 4: A street outside Capulet’s mansion",
   "Act 1 Scene 5: The Great Hall in Capulet’s mansion",
   "Act 2 Scene 1: Outside Capulet’s mansion",
   "Act 2 Scene 2: Capulet’s orchard",
   "Act 2 Scene 3: Outside Friar Lawrence’s cell",
   "Act 2 Scene 4: A street in Verona",
   "Act 2 Scene 5: Capulet’s mansion",
   "Act 2 Scene 6: Friar Lawrence’s cell"
]

var act1SceneCount = 0
for scene in romeoAndJuliet {
    if scene.hasPrefix("Act 1 ") {
        act1SceneCount += 1
    }
}
print("There are \(act1SceneCount) scenes in Act 1")
// Prints "There are 5 scenes in Act 1"

var mansionCount = 0
var cellCount = 0
for scene in romeoAndJuliet {
    if scene.hasSuffix("Capulet’s mansion") {
        mansionCount += 1
    } else if scene.hasSuffix("Friar Lawrence’s cell") {
        cellCount += 1
    }
}
print("\(mansionCount) mansion scenes; \(cellCount) cell scenes")
// Prints "6 mansion scenes; 2 cell scenes"

Unicode Representations of Strings

When a Unicode string is written to a text file or some other storage, the Unicode scalars in that string are encoded in one of several Unicode-defined encoding forms. Each form encodes the string in small chunks known as code units. These include the UTF-8 encoding form (which encodes a string as 8-bit code units), the UTF-16 encoding form (which encodes a string as 16-bit code units), and the UTF-32 encoding form (which encodes a string as 32-bit code units).

Swift provides several different ways to access Unicode representations of strings. You can iterate over the string with a for-in statement, to access its individual Character values as Unicode extended grapheme clusters. This process is described in Working with Characters.

Alternatively, access a String value in one of three other Unicode-compliant representations:

  • A collection of UTF-8 code units (accessed with the string’s utf8 property)
  • A collection of UTF-16 code units (accessed with the string’s utf16 property)
  • A collection of 21-bit Unicode scalar values, equivalent to the string’s UTF-32 encoding form (accessed with the string’s unicodeScalars property)

(Even within Unicode, the representations above differ in the size of the code units used to store each character. ASCII characters, however, always take exactly one code unit in every representation.)

let dogString = "Dog‼🐶"

for codeUnit in dogString.utf8 {
    print("\(codeUnit) ", terminator: "")
}
print("")
// Prints "68 111 103 226 128 188 240 159 144 182 "

for codeUnit in dogString.utf16 {
    print("\(codeUnit) ", terminator: "")
}
print("")
// Prints "68 111 103 8252 55357 56374 "

for scalar in dogString.unicodeScalars {
    print("\(scalar.value) ", terminator: "")
}
print("")
// Prints "68 111 103 8252 128054 "

for scalar in dogString.unicodeScalars {
    print("\(scalar) ")
}
// D
// o
// g
// ‼
// 🐶

String concatenation is as simple as combining two strings with the + operator, and string mutability is managed by choosing between a constant or a variable.Every string is composed of encoding-independent Unicode characters.

NOTE

Swift’s String type is bridged with Foundation’s NSString class. Foundation also extends String to expose methods defined by NSString. This means, if you import Foundation, you can access those NSStringmethods on String without casting.

String Literals

Use a string literal as an initial value for a constant or variable:

let someString = “Some string literal value”

Multiline String Literals

let quotation = “”“
The White Rabbit put on his spectacles.  "Where shall I begin,
please your Majesty?” he asked

“Begin at the beginning,” the King said gravely, “and go on
till you come to the end; then stop.”
“"”

When your source code includes a line break inside of a multiline string literal, that line break also appears in the string’s value. If you want to use line breaks to make your source code easier to read, but you don’t want the line breaks to be part of the string’s value, write a backslash () at the end of those lines:

let softWrappedQuotation = ”“”
The White Rabbit put on his spectacles.  "Where shall I begin,
please your Majesty?“ he asked.

"Begin at the beginning,” the King said gravely, “and go on
till you come to the end; then stop.”
“"”

To make a multiline string literal that begins or ends with a line feed, write a blank line as the first or last line. For example:

let lineBreaks = ”“”

This string starts with a line break.
It also ends with a line break.

“”“

A multiline string can be indented to match the surrounding code. The whitespace before the closing quotation marks (""") tells Swift what whitespace to ignore before all of the other lines. However, if you write whitespace at the beginning of a line in addition to what’s before the closing quotation marks, that whitespace is included.

image

Special Characters in String Literals

String literals can include the following special characters:

  • The escaped special characters (null character), \ (backslash), t (horizontal tab), n (line feed), r(carriage return), " (double quotation mark) and ' (single quotation mark)
  • An arbitrary Unicode scalar, written as u{n}, where n is a 1–8 digit hexadecimal number with a value equal to a valid Unicode code point

Because multiline string literals use three double quotation marks instead of just one, you can include a double quotation mark (") inside of a multiline string literal without escaping it.

Initializing an Empty String

var emptyString = ”“               // empty string literal
var anotherEmptyString = String()  // initializer syntax
// these two strings are both empty, and are equivalent to each other

Find out whether a String value is empty by checking its Boolean isEmpty property:

if emptyString.isEmpty {
   print("Nothing to see here”)
}
// Prints “Nothing to see here”

String Mutability

var variableString = “Horse”
variableString += “ and carriage”
// variableString is now “Horse and carriage”

let constantString = “Highlander”
constantString += “ and another Highlander”
// this reports a compile-time error – a constant string cannot be modified

Strings Are Value Types

Swift’s String type is a value type. If you create a new String value, that String value is copied when it’s passed to a function or method, or when it’s assigned to a constant or variable. In each case, a new copy of the existing String value is created, and the new copy is passed or assigned, not the original version. (copy by value, not reference)

Working with Characters

for character in “Dog!🐶” {
   print(character)
}
// D
// o
// g
// !
// 🐶

let exclamationMark: Character = “!”

캐릭터 배열을 통해 string 만드는법

let catCharacters: [Character] = [“C”, “a”, “t”, “!”, “🐱”]
let catString = String(catCharacters)
print(catString)
// Prints “Cat!🐱”

Concatenating Strings and Characters

let string1 = “hello”

let string2 = “ there”

var welcome = string1 + string2

// welcome now equals “hello there”

var instruction = “look over”
instruction += string2
// instruction now equals “look over there”

let exclamationMark: Character = “!”
welcome.append(exclamationMark)
// welcome now equals “hello there!”

NOTE

You can’t append a String or Character to an existing Character variable, because a Character value must contain a single character only.

String Interpolation

Each item that you insert into the string literal is wrapped in a pair of parentheses, prefixed by a backslash ():

let multiplier = 3
let message = “(multiplier) times 2.5 is (Double(multiplier) * 2.5)”
// message is “3 times 2.5 is 7.5”

Unicode

Swift’s String and Character types are fully Unicode-compliant.

Unicode Scalars

NOTE

A Unicode scalar is any Unicode code point in the range U+0000 to U+D7FF inclusive or U+E000 to U+10FFFFinclusive. Unicode scalars don’t include the Unicode surrogate pair code points, which are the code points in the range U+D800 to U+DFFF inclusive.

Accessing and Modifying a String

You access and modify a string through its methods and properties, or by using subscript syntax.

String Indices

Each String value has an associated index type, String.Index, which corresponds to the position of each Character in the string.

As mentioned above, different characters can require different amounts of memory to store, so in order to determine which Character is at a particular position, you must iterate over each Unicode scalar from the start or end of that String. For this reason, Swift strings can’t be indexed by integer values.

Use the startIndex property to access the position of the first Character of a String. The endIndex property is the position after the last character in a String. As a result, the endIndex property isn’t a valid argument to a string’s subscript. If a String is empty, startIndex and endIndex are equal.

You access the indices before and after a given index using the index(before:) and index(after:) methods of String. To access an index farther away from the given index, you can use the index(_:offsetBy:) method instead of calling one of these methods multiple times.

let greeting = “Guten Tag!”
greeting[greeting.startIndex]
// G
greeting[greeting.index(before: greeting.endIndex)]
// !
greeting[greeting.index(after: greeting.startIndex)]
// u
let index = greeting.index(greeting.startIndex, offsetBy: 7)
greeting[index]
// a

greeting[greeting.endIndex] // Error
greeting.index(after: greeting.endIndex) // Error
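
The two failing calls above trap at runtime. One way to avoid that, sketched here under the assumption that returning nil is acceptable for out-of-range positions, is the standard index(_:offsetBy:limitedBy:) method. The helper name character(in:atOffset:) is hypothetical.

```swift
let text = "Guten Tag!"

// Returns the character `offset` positions from the start,
// or nil when that position falls outside the string.
func character(in string: String, atOffset offset: Int) -> Character? {
    guard offset >= 0,
          let index = string.index(string.startIndex, offsetBy: offset, limitedBy: string.endIndex),
          index < string.endIndex else {
        return nil
    }
    return string[index]
}

print(character(in: text, atOffset: 7) as Any)   // Optional("a")
print(character(in: text, atOffset: 10) as Any)  // nil
```

The extra index < string.endIndex check is needed because limitedBy: still allows endIndex itself, which isn't a valid subscript argument.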

Use the indices property to access all of the indices of individual characters in a string.

for index in greeting.indices {
    print("\(greeting[index]) ", terminator: "")
}
// Prints "G u t e n   T a g ! "

NOTE

You can use the startIndex and endIndex properties and the index(after:) and index(_:offsetBy:) methods on any type that conforms to the Collection protocol. This includes String, as shown here, as well as collection types such as Array, Dictionary, and Set. The index(before:) method additionally requires BidirectionalCollection, which String and Array conform to but Dictionary and Set don't.

(In other words, String, Array, Dictionary, and Set can all be traversed with the same index API.)
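
A short sketch of that shared API: the generic helper below (a hypothetical name, not part of the standard library) uses only Collection requirements, so the same code works for a String and an Array.

```swift
// Returns the second element of any collection, or nil if it has fewer than two.
func secondElement<C: Collection>(of collection: C) -> C.Element? {
    guard let index = collection.index(collection.startIndex, offsetBy: 1, limitedBy: collection.endIndex),
          index < collection.endIndex else {
        return nil
    }
    return collection[index]
}

let numbers = [10, 20, 30, 40]
print(numbers[numbers.startIndex])                        // 10
print(numbers[numbers.index(before: numbers.endIndex)])   // 40
print(secondElement(of: numbers) as Any)                  // Optional(20)
print(secondElement(of: "Hi") as Any)                     // Optional("i")
```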

Inserting and Removing

insert(_:at:) inserts a single character at the given index.

insert(contentsOf:at:) inserts the contents of another string at the given index.

var welcome = "hello"
welcome.insert("!", at: welcome.endIndex)
// welcome now equals "hello!"
welcome.insert(contentsOf: " there", at: welcome.index(before: welcome.endIndex))
// welcome now equals "hello there!"

remove(at:) removes a single character at the given index.

removeSubrange(_:) removes the substring in the given range.

welcome.remove(at: welcome.index(before: welcome.endIndex))
// welcome now equals "hello there"

let range = welcome.index(welcome.endIndex, offsetBy: -6)..<welcome.endIndex
welcome.removeSubrange(range)
// welcome now equals "hello"

NOTE

You can use the insert(_:at:), insert(contentsOf:at:), remove(at:), and removeSubrange(_:) methods on any type that conforms to the RangeReplaceableCollection protocol. This includes String, as shown here, as well as collection types such as Array. Dictionary and Set don't conform to RangeReplaceableCollection; they have their own insertion and removal APIs.

(In other words, String and Array can be modified in the same way.)
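
To illustrate, here is the same sequence of edits applied to an Array of Character values instead of a String. This is only a sketch; the variable name letters is made up.

```swift
var letters: [Character] = ["h", "e", "l", "l", "o"]

letters.insert("!", at: letters.endIndex)
letters.insert(contentsOf: " there", at: letters.index(before: letters.endIndex))
print(String(letters))  // hello there!

letters.remove(at: letters.index(before: letters.endIndex))
let tail = letters.index(letters.endIndex, offsetBy: -6)..<letters.endIndex
letters.removeSubrange(tail)
print(String(letters))  // hello
```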

Substrings

When you get a substring from a string (for example, using a subscript or a method like prefix(_:)), the result is an instance of Substring, not another string. Substrings in Swift have most of the same methods as strings, which means you can work with substrings the same way you work with strings. However, unlike strings, you use substrings for only a short amount of time while performing actions on a string. When you’re ready to store the result for a longer time, you convert the substring to an instance of String.

let greeting = "Hello, world!"
let index = greeting.firstIndex(of: ",") ?? greeting.endIndex
let beginning = greeting[..<index]
// beginning is "Hello"
// Convert the result to a String for long-term storage.
let newString = String(beginning)

NOTE

Both String and Substring conform to the StringProtocol protocol, which means it’s often convenient for string-manipulation functions to accept a StringProtocol value. You can call such functions with either a String or Substring value.
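
A minimal sketch of such a function: vowelCount(in:) is a hypothetical helper that is generic over StringProtocol, so it accepts both a String and a Substring without any conversion.

```swift
// Counts the ASCII vowels in any StringProtocol value (String or Substring).
func vowelCount<S: StringProtocol>(in text: S) -> Int {
    let vowels: Set<Character> = ["a", "e", "i", "o", "u"]
    return text.lowercased().filter { vowels.contains($0) }.count
}

let phrase = "Hello, world!"
let firstWord = phrase.prefix(5)      // a Substring, "Hello"

print(vowelCount(in: phrase))         // 3
print(vowelCount(in: firstWord))      // 2
```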

Comparing Strings

Swift provides three ways to compare textual values: string and character equality, prefix equality, and suffix equality.

String and Character Equality

String and character equality is checked with the “equal to” operator (==) and the “not equal to” operator (!=), as described in Comparison Operators:

let quotation = "We’re a lot alike, you and I."
let sameQuotation = "We’re a lot alike, you and I."
if quotation == sameQuotation {
    print("These two strings are considered equal")
}
// Prints "These two strings are considered equal"

Two String values (or two Character values) are considered equal if their extended grapheme clusters are canonically equivalent. Extended grapheme clusters are canonically equivalent if they have the same linguistic meaning and appearance, even if they’re composed from different Unicode scalars behind the scenes.
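
As a quick illustration of canonical equivalence, the two strings below spell "café" with different Unicode scalars (a precomposed é versus e plus a combining accent) and still compare equal. The constant names are made up.

```swift
let precomposed = "caf\u{E9}"         // é as the single scalar U+00E9
let decomposed = "caf\u{65}\u{301}"   // e followed by U+0301 COMBINING ACUTE ACCENT

print(precomposed == decomposed)             // true
print(precomposed.unicodeScalars.count)      // 4
print(decomposed.unicodeScalars.count)       // 5
print(precomposed.count, decomposed.count)   // 4 4
```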

Prefix and Suffix Equality

To check whether a string has a particular string prefix or suffix, call the string’s hasPrefix(_:) and hasSuffix(_:) methods, both of which take a single argument of type String and return a Boolean value.

let romeoAndJuliet = [
   "Act 1 Scene 1: Verona, A public place",
   "Act 1 Scene 2: Capulet’s mansion",
   "Act 1 Scene 3: A room in Capulet’s mansion",
   "Act 1 Scene 4: A street outside Capulet’s mansion",
   "Act 1 Scene 5: The Great Hall in Capulet’s mansion",
   "Act 2 Scene 1: Outside Capulet’s mansion",
   "Act 2 Scene 2: Capulet’s orchard",
   "Act 2 Scene 3: Outside Friar Lawrence’s cell",
   "Act 2 Scene 4: A street in Verona",
   "Act 2 Scene 5: Capulet’s mansion",
   "Act 2 Scene 6: Friar Lawrence’s cell"
]

var act1SceneCount = 0
for scene in romeoAndJuliet {
    if scene.hasPrefix("Act 1 ") {
        act1SceneCount += 1
    }
}
print("There are \(act1SceneCount) scenes in Act 1")
// Prints "There are 5 scenes in Act 1"

var mansionCount = 0
var cellCount = 0
for scene in romeoAndJuliet {
    if scene.hasSuffix("Capulet’s mansion") {
        mansionCount += 1
    } else if scene.hasSuffix("Friar Lawrence’s cell") {
        cellCount += 1
    }
}
print("\(mansionCount) mansion scenes; \(cellCount) cell scenes")
// Prints "6 mansion scenes; 2 cell scenes"

Unicode Representations of Strings

When a Unicode string is written to a text file or some other storage, the Unicode scalars in that string are encoded in one of several Unicode-defined encoding forms. Each form encodes the string in small chunks known as code units. These include the UTF-8 encoding form (which encodes a string as 8-bit code units), the UTF-16 encoding form (which encodes a string as 16-bit code units), and the UTF-32 encoding form (which encodes a string as 32-bit code units).

Swift provides several different ways to access Unicode representations of strings. You can iterate over the string with a for-in statement, to access its individual Character values as Unicode extended grapheme clusters. This process is described in Working with Characters.

Alternatively, access a String value in one of three other Unicode-compliant representations:

  • A collection of UTF-8 code units (accessed with the string’s utf8 property)
  • A collection of UTF-16 code units (accessed with the string’s utf16 property)
  • A collection of 21-bit Unicode scalar values, equivalent to the string’s UTF-32 encoding form (accessed with the string’s unicodeScalars property)

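(Even within Unicode, a string is split up differently, as listed above, depending on the code unit size used for each character. An ASCII character, however, takes a single code unit with the same value in every one of these representations.)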

let dogString = "Dog‼🐶"

for codeUnit in dogString.utf8 {
    print("\(codeUnit) ", terminator: "")
}
print("")
// Prints "68 111 103 226 128 188 240 159 144 182 "

for codeUnit in dogString.utf16 {
    print("\(codeUnit) ", terminator: "")
}
print("")
// Prints "68 111 103 8252 55357 56374 "

for scalar in dogString.unicodeScalars {
    print("\(scalar.value) ", terminator: "")
}
print("")
// Prints "68 111 103 8252 128054 "

for scalar in dogString.unicodeScalars {
    print("\(scalar) ")
}
// D
// o
// g
// ‼
// 🐶
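
The loops above can be summarized by comparing the length of each view of the same string. This is just a sketch; dogText is a made-up name holding the same value as dogString.

```swift
let dogText = "Dog‼🐶"

print(dogText.count)                  // 5  Characters (extended grapheme clusters)
print(dogText.unicodeScalars.count)   // 5  Unicode scalars
print(dogText.utf16.count)            // 6  UTF-16 code units (🐶 is a surrogate pair)
print(dogText.utf8.count)             // 10 UTF-8 code units (‼ takes 3 bytes, 🐶 takes 4)
```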

String concatenation is as simple as combining two strings with the + operator, and string mutability is managed by choosing between a constant or a variable. Every string is composed of encoding-independent Unicode characters.

NOTE

Swift’s String type is bridged with Foundation’s NSString class. Foundation also extends String to expose methods defined by NSString. This means, if you import Foundation, you can access those NSString methods on String without casting.

String Literals

Use a string literal as an initial value for a constant or variable:

let someString = "Some string literal value"

Multiline String Literals

let quotation = """
The White Rabbit put on his spectacles.  "Where shall I begin,
please your Majesty?" he asked.

"Begin at the beginning," the King said gravely, "and go on
till you come to the end; then stop."
"""

When your source code includes a line break inside of a multiline string literal, that line break also appears in the string’s value. If you want to use line breaks to make your source code easier to read, but you don’t want the line breaks to be part of the string’s value, write a backslash (\) at the end of those lines:

let softWrappedQuotation = """
The White Rabbit put on his spectacles.  "Where shall I begin, \
please your Majesty?" he asked.

"Begin at the beginning," the King said gravely, "and go on \
till you come to the end; then stop."
"""

To make a multiline string literal that begins or ends with a line feed, write a blank line as the first or last line. For example:

let lineBreaks = """

This string starts with a line break.
It also ends with a line break.

"""

A multiline string can be indented to match the surrounding code. The whitespace before the closing quotation marks (""") tells Swift what whitespace to ignore before all of the other lines. However, if you write whitespace at the beginning of a line in addition to what’s before the closing quotation marks, that whitespace is included.
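
A small sketch of that indentation rule: the closing delimiter below is indented by four spaces, so four spaces are stripped from every line, and only the extra indentation on the second line survives.

```swift
let linesWithIndentation = """
    This line doesn't begin with whitespace.
        This line begins with four spaces.
    This line doesn't begin with whitespace.
    """

print(linesWithIndentation)
```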

image

Special Characters in String Literals

String literals can include the following special characters:

  • The escaped special characters \0 (null character), \\ (backslash), \t (horizontal tab), \n (line feed), \r (carriage return), \" (double quotation mark) and \' (single quotation mark)
  • An arbitrary Unicode scalar, written as \u{n}, where n is a 1–8 digit hexadecimal number with a value equal to a valid Unicode code point

Because multiline string literals use three double quotation marks instead of just one, you can include a double quotation mark (") inside of a multiline string literal without escaping it.
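
A few of these special characters in use, as a brief sketch. The constant names are chosen only for illustration.

```swift
let wiseWords = "\"Imagination is more important than knowledge\" - Einstein"
let dollarSign = "\u{24}"         // $,  Unicode scalar U+0024
let blackHeart = "\u{2665}"       // ♥,  Unicode scalar U+2665
let sparklingHeart = "\u{1F496}"  // 💖, Unicode scalar U+1F496
let table = "name\tvalue\nx\t1"   // a tab between columns, a line feed between rows

print(wiseWords)
print(dollarSign, blackHeart, sparklingHeart)
print(table)
```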

Initializing an Empty String

var emptyString = ""               // empty string literal
var anotherEmptyString = String()  // initializer syntax
// these two strings are both empty, and are equivalent to each other

Find out whether a String value is empty by checking its Boolean isEmpty property:

if emptyString.isEmpty {
    print("Nothing to see here")
}
// Prints "Nothing to see here"

String Mutability

var variableString = "Horse"
variableString += " and carriage"
// variableString is now "Horse and carriage"

let constantString = "Highlander"
constantString += " and another Highlander"
// this reports a compile-time error - a constant string cannot be modified

Strings Are Value Types

Swift’s String type is a value type. If you create a new String value, that String value is copied when it’s passed to a function or method, or when it’s assigned to a constant or variable. In each case, a new copy of the existing String value is created, and the new copy is passed or assigned, not the original version. (copy by value, not reference)
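
A minimal sketch of what copy-by-value means in practice: changing the copy leaves the original untouched. The names are made up for illustration.

```swift
let original = "Horse"
var copy = original          // copy now holds its own value
copy += " and carriage"

print(original)  // Horse
print(copy)      // Horse and carriage

// A function receives its own copy as well; the caller's string is unaffected.
func exclaim(_ text: String) -> String {
    var local = text
    local += "!"
    return local
}

print(exclaim(original))  // Horse!
print(original)           // Horse
```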

Working with Characters

for character in "Dog!🐶" {
    print(character)
}
// D
// o
// g
// !
// 🐶

let exclamationMark: Character = "!"

How to build a String from an array of Character values:

let catCharacters: [Character] = ["C", "a", "t", "!", "🐱"]
let catString = String(catCharacters)
print(catString)
// Prints "Cat!🐱"

Concatenating Strings and Characters

let string1 = "hello"
let string2 = " there"
var welcome = string1 + string2
// welcome now equals "hello there"

var instruction = "look over"
instruction += string2
// instruction now equals "look over there"

let exclamationMark: Character = "!"
welcome.append(exclamationMark)
// welcome now equals "hello there!"
