2020-07-19, 21:46   #18
Dylan14
 
Strings part 2 - indexing, repeating, and slicing

So far, we have covered what strings are, how to concatenate them, and how to use the format method to change how a string appears when printed. But there are other things we can do with strings:
1. We can pick out individual characters of a string by index.
2. We can repeat a string multiple times.
3. We can slice a string, to extract only a certain part of the whole string.
There are other useful things, which I will mention briefly in the next section. More details can be found in the Python documentation, or elsewhere.
But first, let's revisit multiline strings:

Multiline strings:

Recall when I wrote about multiple line comments, I had an example as follows:
Code:
"""
Some comments need 
more than one line, 
like this one
"""
This is actually a valid string, as we can see by typing it into an interpreter:
Code:
>>> msg = """Some comments need
more than one line,
like this one
"""
>>> print(msg)
Some comments need
more than one line,
like this one
In general, to write a multiline string like this, you enclose the 'stuff' in either triple double quotes ("""stuff""") or triple single quotes ('''stuff'''). Alternatively, you could use a single line string, and the escape sequence '\n' to start a new line, as follows:

Code:
>>> msg2 = "The rain in Spain \nfalls mainly on the plain."
>>> print(msg2)
The rain in Spain 
falls mainly on the plain.
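
To convince yourself that the two styles really produce the same kind of string, here is a small sketch (the variable names are just mine, made up for illustration). It builds the same text both ways and compares them:

```python
# Two ways to build the same two-line string.
triple = """The rain in Spain
falls mainly on the plain."""
escaped = "The rain in Spain\nfalls mainly on the plain."

# Both contain exactly one newline character, so they compare equal.
same = (triple == escaped)
line_count = len(escaped.splitlines())

print(same)          # True
print(line_count)    # 2
print(repr(triple))  # repr shows the newline as the escape sequence \n
```

Using repr is a handy way to see the escape sequences that print would otherwise turn into actual line breaks.
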
Indexing and slicing strings:

As in many other languages, a Python string is a sequence of characters. (In Python 3 these are Unicode characters, not raw bytes.)

Unlike some other languages, such as C, Python does not have a character data type (*). Instead, a character is just a string of length one.
Since a string is a sequence, I can index its elements. A couple of notes, where n is the length of the string:
1. Indexing starts from 0, which denotes the first character.
2. As a corollary of this, the largest valid index is n-1, which is the last character in the string. If you go beyond this, you will get an IndexError.
3. You can use negative indices. In this case you will index from the last character in the string, and work backwards. Here the furthest back you can go is -n, which is the first character in the string.
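
The three notes above can be checked with a short sketch. The string and variable names here are only for illustration, with n standing for the length of the string:

```python
s = "Spain"
n = len(s)            # 5

first = s[0]          # index 0 is the first character
last = s[n - 1]       # index n-1 is the last character
also_last = s[-1]     # -1 counts from the end
also_first = s[-n]    # -n is as far back as we can go

# A single character is just a str of length 1.
kind = type(first).__name__

# For every valid non-negative index i, s[i] and s[i - n] are the same character.
all_match = all(s[i] == s[i - n] for i in range(n))

print(first, last, also_first, also_last)  # S n S n
print(kind, len(first))                    # str 1
print(all_match)                           # True
```
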

Examples:

Let's consider the following single line string:
Code:
The rain in Spain falls mainly on the plain.
To ensure that we don't get an IndexError, we will use the len function, which gives the length of the string.
Code:
>>> msg3 = "The rain in Spain falls mainly on the plain."
>>> len(msg3)
44
With this, the valid indices are -44 to 43.
1. Let's pick out the S in Spain. We count from 0, which is the T in the. 1 - h, 2 - e, 3 - space, 4 - r, 5 - a, 6 - i, 7 - n, 8 - space, 9 - i, 10 - n, 11 - space, 12 - S. So msg3[12] should be S. Let's check:
Code:
>>> print(msg3[12])
S
2. Alternatively, let's go backwards. -1 - ., -2 - n, -3 - i, ..., -32 - S. So msg3[-32] should also be S. Let's check:
Code:
>>> print(msg3[-32])
S
3. What happens if we try index 50?
Code:
>>> print(msg3[50])
Traceback (most recent call last):
  File "<pyshell#18>", line 1, in <module>
    print(msg3[50])
IndexError: string index out of range
We get an error, as 50 > 43.
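
If you would rather not have the program stop on a bad index, one option is to catch the IndexError. The following is just a sketch: char_at is a made-up helper of mine, not something built into Python.

```python
msg3 = "The rain in Spain falls mainly on the plain."

def char_at(text, index, default=None):
    """Return text[index], or default if the index is out of range.

    Hypothetical helper, for illustration only.
    """
    try:
        return text[index]
    except IndexError:
        return default

print(char_at(msg3, 12))    # S
print(char_at(msg3, -32))   # S
print(char_at(msg3, 50))    # None

# The same check done by hand: valid indices run from -len to len - 1.
print(-len(msg3) <= 50 < len(msg3))   # False
```
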

Slicing a string:
Now, we will use indices to get a piece of the string. A slice is written with colons, as string[start:stop:step], and takes up to three arguments, each of which must be an integer (or be left out).
1. The first number denotes the initial index.
2. The second number denotes the final index. The character at this index is not included; the slice stops just before it.
3. The third number denotes the step. This can't be 0, and by default is 1 (meaning, take every character from the initial index up to, but not including, the final index).
All three arguments are optional, as long as at least one colon is present. This goes as follows:
By default, with a positive step, the first argument is 0 (start at the beginning of the string), the second argument is len(string) (go through the end of the string) and the third argument is 1.
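
Here is a quick sketch for checking the defaults yourself, using a short string so the indices are easy to follow. Note that the stop index is excluded from the result, and that, unlike plain indexing, a slice that runs past the end does not raise an error:

```python
s = "plain."
n = len(s)               # 6

whole = s[:]             # all arguments left out
explicit = s[0:n:1]      # the same slice with the defaults written out
print(whole == explicit == s)   # True

# The stop index is excluded: this stops just before the final "."
without_dot = s[0:n - 1]
print(without_dot)       # plain

# Out-of-range slice bounds are quietly clamped instead of raising IndexError.
clamped = s[2:100]
empty = s[100:]
print(clamped)           # ain.
print(repr(empty))       # ''
```
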

Examples: Again, let's consider the following string:
Code:
The rain in Spain falls mainly on the plain.
1. Let's take the 7th through 14th characters in the string (indices 6 through 13, since the final index 14 is not included):
Code:
>>> msg4 = "The rain in Spain falls mainly on the plain."
>>> print(msg4[6:14])
in in Sp
2. Let's print from the 5th character from the end through the end of the string:
Code:
>>> print(msg4[-5:])
lain.
3. Let's print every third character of the string:
Code:
>>> print(msg4[::3])
T iiSiflmn  eln
Here, the final index that's printed is 42. The next jump, 45, is outside the range.
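
To see exactly which indices a step of 3 visits, we can slice a list of the indices themselves. The sketch below also goes slightly beyond the examples above: it shows that a step of 0 raises a ValueError, and that a negative step walks the string backwards. The variable names are only for illustration.

```python
msg4 = "The rain in Spain falls mainly on the plain."

# Slice the list of indices the same way we sliced the string.
picked = list(range(len(msg4)))[::3]
last_index = picked[-1]
print(picked)        # 0, 3, 6, ... up to the last index that fits
print(last_index)    # 42

# Rebuilding the string from those indices matches the slice.
rebuilt = "".join(msg4[i] for i in picked)
print(rebuilt == msg4[::3])   # True

# A step of 0 is not allowed.
try:
    msg4[::0]
    zero_step_error = None
except ValueError as err:
    zero_step_error = str(err)
print(zero_step_error)

# A negative step goes backwards; -1 reverses the whole string.
backwards = msg4[::-1]
print(backwards)     # .nialp eht no ylniam sllaf niapS ni niar ehT
```
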

Repeating a string:
One can "multiply" a string, or more correctly, repeat a string, by using the * symbol. The syntax looks like the following:
Code:
int * "stuff"
where 'int' is a positive integer and "stuff" is a string (**).

Example:

The usual battle of people in Misc. Math here goes as follows:
1. Some crank spews some garbage.
2. The intelligent people in the forum debunk said garbage.
3. The cycle repeats until the thread is locked or the crank leaves.

Let's describe it in a bit of Python:

Code:
>>> msg5 = "garbage, debunk, "
>>> print(3 * msg5 + "then the thread ends.")
garbage, debunk, garbage, debunk, garbage, debunk, then the thread ends.
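
A few more things about repetition that are easy to check, as a rough sketch (names are mine): the order of the integer and the string does not matter, zero or negative counts give an empty string, and a non-integer count is an error.

```python
# A common practical use: drawing a separator line.
separator = "-" * 20
print(separator)

# The integer can go on either side of the *.
either_side = (3 * "ab" == "ab" * 3)
print(either_side)        # True

# Zero or negative counts give an empty string rather than an error.
zero_times = 0 * "ab"
negative_times = -2 * "ab"
print(repr(zero_times), repr(negative_times))   # '' ''

# A non-integer count is not allowed.
try:
    "ab" * 1.5
    float_error = None
except TypeError as err:
    float_error = str(err)
print(float_error)
```
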
(*) The ctypes module can be imported to give Python a C-compatible char data type. But by default Python does not have a character data type.
(**) Actually, you can use 0 or a negative integer for the integer. But the result is an empty string, so nothing visible will print.

Last fiddled with by Dylan14 on 2020-10-25 at 02:21 Reason: index 6 = 7th character! and similarly for index 14