How does String.Index work in Swift
Swift strings and indexing have been a pain for me to understand.
Specifically I was trying the following:
let str = "Hello, playground"
let prefixRange = str.startIndex..<str.startIndex.advancedBy(5) // error
where the second line was giving me the following error:
'advancedBy' is unavailable: To advance an index by n steps call 'index(_:offsetBy:)' on the CharacterView instance that produced the index.
I see that String has the following methods.
str.index(after: String.Index)
str.index(before: String.Index)
str.index(String.Index, offsetBy: String.IndexDistance)
str.index(String.Index, offsetBy: String.IndexDistance, limitedBy: String.Index)
These were really confusing me at first so I started playing around with them until I understood them. I am adding an answer below to show how they are used.
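For reference, the failing line from the question can be rewritten with index(_:offsetBy:), as the error message suggests. This is a minimal sketch, assuming Swift 4 or later; the names prefixEnd and prefix are introduced here only for illustration.

```swift
let str = "Hello, playground"

// Ask the string itself to advance its start index by five characters.
let prefixEnd = str.index(str.startIndex, offsetBy: 5)
let prefixRange = str.startIndex..<prefixEnd

// Subscripting with a range yields a Substring; wrap it to get a String.
let prefix = String(str[prefixRange])
print(prefix) // Hello
```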
All of the following examples use
var str = "Hello, playground"
startIndex and endIndex
- startIndex is the index of the first character.
- endIndex is the index after the last character.
Example
// character
str[str.startIndex] // H
str[str.endIndex] // error: after last character
// range
let range = str.startIndex..<str.endIndex
str[range] // "Hello, playground"
With Swift 4's one-sided ranges, the range can be simplified to one of the following forms.
let range = str.startIndex...
let range = ..<str.endIndex
I will use the full form in the following examples for the sake of clarity, but for the sake of readability, you will probably want to use the one-sided ranges in your own code.
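As a quick illustration of the one-sided forms, the sketch below splits the string at an index. It assumes Swift 4 or later; middle, head, and tail are names made up for this example.

```swift
let str = "Hello, playground"
let middle = str.index(str.startIndex, offsetBy: 7)

let head = str[..<middle]   // everything before the index: "Hello, "
let tail = str[middle...]   // everything from the index onward: "playground"

print(tail) // playground
```

Both results are Substring values that share storage with str, so wrap them in String(...) if you need to keep them around.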
after
As in: index(after: String.Index)
after refers to the index of the character directly after the given index.
Examples
// character
let index = str.index(after: str.startIndex)
str[index] // "e"
// range
let range = str.index(after: str.startIndex)..<str.endIndex
str[range] // "ello, playground"
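To show how index(after:) composes, here is a small sketch that walks a string one character at a time, stopping when the index reaches endIndex. The variable names are illustrative, and in real code iterating over str.indices or over the string itself is usually simpler.

```swift
let str = "Hello"
var visited: [Character] = []

// Start at the first character and step forward until endIndex is reached.
var index = str.startIndex
while index != str.endIndex {
    visited.append(str[index])
    index = str.index(after: index)
}

print(visited) // ["H", "e", "l", "l", "o"]
```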
before
As in: index(before: String.Index)
before refers to the index of the character directly before the given index.
Examples
// character
let index = str.index(before: str.endIndex)
str[index] // d
// range
let range = str.startIndex..<str.index(before: str.endIndex)
str[range] // Hello, playgroun
offsetBy
As in: index(String.Index, offsetBy: String.IndexDistance)
- The offsetBy value can be positive or negative and starts from the given index. Although it is of the type String.IndexDistance, you can give it an Int.
Examples
// character
let index = str.index(str.startIndex, offsetBy: 7)
str[index] // p
// range
let start = str.index(str.startIndex, offsetBy: 7)
let end = str.index(str.endIndex, offsetBy: -6)
let range = start..<end
str[range] // play
limitedBy
As in: index(String.Index, offsetBy: String.IndexDistance, limitedBy: String.Index)
- The limitedBy parameter is useful for making sure that the offset does not cause the index to go out of bounds. It is a bounding index. Since it is possible for the offset to exceed the limit, this method returns an Optional. It returns nil if the offset would move the index past the limit.
Example
// character
if let index = str.index(str.startIndex, offsetBy: 7, limitedBy: str.endIndex) {
    str[index] // p
}
If the offset had been 77 instead of 7, then the if statement would have been skipped.
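One edge case is worth sketching: an offset that lands exactly on the limit is returned rather than nil, so limiting by endIndex can still hand back an index that cannot be subscripted. The example below assumes a non-empty string; tooFar, atEnd, and lastValid are illustrative names.

```swift
let str = "Hello, playground"

// An offset past the limit yields nil.
let tooFar = str.index(str.startIndex, offsetBy: 77, limitedBy: str.endIndex)

// An offset that lands exactly on the limit is returned, not nil.
let atEnd = str.index(str.startIndex, offsetBy: str.count, limitedBy: str.endIndex)

// Limiting by the last valid index keeps every non-nil result subscriptable
// (this assumes the string is not empty).
let lastValid = str.index(before: str.endIndex)
if let index = str.index(str.startIndex, offsetBy: 7, limitedBy: lastValid) {
    print(str[index]) // p
}
```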
Why is String.Index needed?
It would be much easier to use an Int index for Strings. The reason that you have to create a new String.Index for every String is that Characters in Swift are not all the same length under the hood. A single Swift Character might be composed of one, two, or even more Unicode code points. Thus each unique String must calculate the indexes of its Characters.
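The variable-length point can be seen directly by comparing the different views of a string. This is a small sketch, assuming Swift 4 or later; composed is a name chosen for the example.

```swift
// "e" followed by U+0301 COMBINING ACUTE ACCENT renders as a single "é".
let composed = "e\u{301}"

print(composed.count)                // 1 (one Character)
print(composed.unicodeScalars.count) // 2 (two Unicode scalars)
print(composed.utf8.count)           // 3 (three UTF-8 bytes)

// The first Character is the whole cluster, equal to precomposed U+00E9.
let first = composed[composed.startIndex]
print(first == "\u{E9}")             // true
```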
It is possible to hide this complexity behind an Int index extension, but I am reluctant to do so. It is good to be reminded of what is actually happening.
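For completeness, such an extension might look like the sketch below. The subscript(offset:) label is hypothetical and not part of the standard library. Each access walks the string from the start, so it costs O(n), which is part of the reason for the reluctance.

```swift
extension String {
    // Hypothetical convenience subscript, not a standard library API.
    // Returns nil for offsets outside 0..<count instead of trapping.
    subscript(offset offset: Int) -> Character? {
        guard offset >= 0, offset < count else { return nil }
        return self[index(startIndex, offsetBy: offset)]
    }
}

let str = "Hello, playground"
let eighth = str[offset: 7]   // "p"
let missing = str[offset: 77] // nil
```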
- Why would startIndex be anything other than 0? Jul 3 '17 at 11:44
- @RoboRobok: Because Swift works with Unicode characters, which are made of "grapheme clusters", Swift doesn't use integers to represent index locations. Let's say your first character is an é. It may actually be made of an e plus a \u{301} combining accent. If you used an index of zero, you would get either the e or the accent ("acute") character, not the entire cluster that makes up the é. Using the startIndex ensures you'll get the entire grapheme cluster for any character. – leanne Aug 18 '17 at 16:16
- In Swift 4.0, each Unicode character (grapheme cluster) is counted as 1. E.g.: "👩‍💻".count // Now: 1, Before: 2 – selva Sep 29 '17 at 6:50
- How does one construct a String.Index from an integer, other than building a dummy string and using the .index method on it? I don't know if I'm missing something, but the docs don't say anything. – sudo Oct 10 '17 at 20:43
- @sudo, you have to be a little careful when constructing a String.Index with an integer because each Swift Character does not necessarily equal the same thing you mean with an integer. That said, you can pass an integer into the offsetBy parameter to create a String.Index. If you don't have a String, though, then you can't construct a String.Index (because Swift can only calculate the index if it knows what the previous characters in the string are). If you change the string then you must recalculate the index. You can't use the same String.Index on two different strings. – Suragch Oct 11 '17 at 2:15