
10.2 Manipulating Strings

The String class provides a host of methods for comparing, searching, and manipulating strings, the most important of which are shown in Table 10-1.

Table 10-1. String class methods

Method or field   Explanation

Chars             The string indexer
Compare( )        Overloaded public shared method that compares two strings
Copy( )           Public shared method that creates a new string by copying another
Equals( )         Overloaded public shared and instance method that determines if two strings have the same value
Format( )         Overloaded public shared method that formats a string using a format specification
Length            The number of characters in the instance
PadLeft( )        Right-aligns the characters in the string, padding to the left with spaces or a specified character
PadRight( )       Left-aligns the characters in the string, padding to the right with spaces or a specified character
Remove( )         Deletes the specified number of characters
Split( )          Divides a string, returning the substrings delimited by the specified characters
StartsWith( )     Indicates if the string starts with the specified characters
Substring( )      Retrieves a substring
ToCharArray( )    Copies the characters from the string to a character array
ToLower( )        Returns a copy of the string in lowercase
ToUpper( )        Returns a copy of the string in uppercase
Trim( )           Removes all occurrences of a set of specified characters from beginning and end of the string
TrimEnd( )        Behaves like Trim( ), but only at the end
TrimStart( )      Behaves like Trim( ), but only at the start
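
Several of the methods in Table 10-1 do not appear in the numbered examples in this section. The following is a minimal sketch of a few of them; the module name and the FormatRow helper are hypothetical and exist only for illustration.

```vbnet
Option Strict On
Imports System

Module TableMethodsSketch
    ' Hypothetical helper: builds one report line with a left-aligned,
    ' dot-padded name column and a right-aligned amount column.
    Function FormatRow(ByVal name As String, ByVal amount As String) As String
        Return name.Trim().PadRight(10, "."c) & amount.Trim().PadLeft(6)
    End Function

    Sub Main()
        Console.WriteLine(FormatRow("  tea ", "3.50"))   ' tea.......  3.50
        Console.WriteLine("abcd".ToUpper())               ' ABCD
        Console.WriteLine("abcd".StartsWith("ab"))        ' True
        Console.WriteLine("abcdef".Remove(1, 2))          ' adef
        Console.WriteLine("--abcd--".Trim("-"c))          ' abcd
    End Sub
End Module
```

Each call returns a new string; none of them changes the string it is called on.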

10.2.1 Comparing Strings

The Compare( ) method is overloaded. The first version takes two strings and returns a negative number if the first string sorts before the second, a positive number if it sorts after the second, and zero if the two are equal. The second version takes a third, Boolean argument; when that argument is True, the comparison ignores case. Example 10-1 illustrates the use of Compare( ).

Example 10-1. Compare( ) method
Option Strict On
Imports System
Namespace StringManipulation
    Class Tester

        Public Sub Run( )
            ' create some Strings to work with
            Dim s1 As String = "abcd"
            Dim s2 As String = "ABCD"
            Dim result As Integer ' hold the results of comparisons
            ' compare two Strings, case sensitive
            result = String.Compare(s1, s2)
            Console.WriteLine( _
              "compare s1: {0}, s2: {1}, result: {2}" _ 
              & Environment.NewLine, s1, s2, result)

            ' overloaded compare, takes boolean "ignore case" 
            '(True = ignore case)
            result = String.Compare(s1, s2, True)
            Console.WriteLine("Compare insensitive. result: {0}" _
               & Environment.NewLine, result)
        End Sub 'Run

        Shared Sub Main( )
            Dim t As New Tester( )
            t.Run( )
        End Sub 'Main
    End Class 'Tester
End Namespace 'StringManipulation

Output:
compare s1: abcd, s2: ABCD, result: -1
Compare insensitive. result: 0

This code uses the shared NewLine property of the Environment class to create a new line in the output. This is a very general way to ensure that the correct code sequence is sent to create the newline on the current operating system. As an alternative you can use vbNewLine from the Microsoft.VisualBasic namespace.

Example 10-1 begins by declaring two strings, s1 and s2, initialized with string literals:

Dim s1 As String = "abcd"
Dim s2 As String = "ABCD"

Compare( ) is used with many types. A negative return value indicates that the first parameter is less than the second, a positive result indicates the first parameter is greater than the second, and a zero indicates they are equal.

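By default, Compare( ) performs a culture-sensitive comparison, and in the sort order of most cultures a lowercase letter comes before its uppercase equivalent. (This is the reverse of the raw Unicode and ASCII character codes, in which uppercase letters have the smaller values.) Thus, the output indicates that s1 (abcd) is "less than" s2 (ABCD):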

compare s1: abcd, s2: ABCD, result: -1

The second comparison uses an overloaded version of Compare( ) that takes a third Boolean parameter, the value of which determines whether case should be ignored in the comparison. If the value of this "ignore case" parameter is true, the comparison is made without regard to case. This time the result is 0, indicating that the two strings are identical (without regard to case):

Compare insensitive. result: 0
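
The sign of the result depends on which comparison rules are in effect. As a rough sketch, the following contrasts String.Compare( ) with String.CompareOrdinal( ), which compares raw character codes only. The module name and the SignOf helper are hypothetical. No expected output is shown for the culture-sensitive call because it depends on the culture settings of the machine running the code.

```vbnet
Option Strict On
Imports System

Module CompareSketch
    ' Hypothetical helper: reduces a comparison result to -1, 0, or 1
    ' so that only the sign is displayed.
    Function SignOf(ByVal result As Integer) As Integer
        Return Math.Sign(result)
    End Function

    Sub Main()
        Dim s1 As String = "abcd"
        Dim s2 As String = "ABCD"

        ' Ordinal comparison uses character codes: "a" is 97 and "A" is 65,
        ' so s1 compares as greater than s2.
        Console.WriteLine(SignOf(String.CompareOrdinal(s1, s2)))   ' 1

        ' Ignoring case, the two strings compare as equal.
        Console.WriteLine(SignOf(String.Compare(s1, s2, True)))    ' 0

        ' Culture-sensitive comparison (the default); the sign depends
        ' on the sort rules of the current culture.
        Console.WriteLine(SignOf(String.Compare(s1, s2)))
    End Sub
End Module
```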

10.2.2 Concatenating Strings

There are a couple of ways to concatenate strings in VB.NET. You can use the Concat( ) method, which is a shared public method of the String class:

Dim s3 As String = String.Concat(s1, s2)

Or you can simply use the concatenation (&) operator:

Dim s4 As String = s1 & s2

These two methods are demonstrated in Example 10-2.

Example 10-2. Concatenation
Option Strict On
Imports System
Namespace StringManipulation
    Class Tester

        Public Sub Run( )
            Dim s1 As String = "abcd"
            Dim s2 As String = "ABCD"

            ' concatenation method
            Dim s3 As String = String.Concat(s1, s2)
            Console.WriteLine("s3 concatenated from s1 and s2: {0}", s3)

            ' use the overloaded operator
            Dim s4 As String = s1 & s2
            Console.WriteLine("s4 concatenated from s1 & s2: {0}", s4)
        End Sub 'Run

        Public Shared Sub Main( )
            Dim t As New Tester( )
            t.Run( )
        End Sub 'Main
    End Class 'Tester
End Namespace 'StringManipulation

Output:
s3 concatenated from s1 and s2: abcdABCD
s4 concatenated from s1 & s2: abcdABCD

In Example 10-2, the new string s3 is created by calling the shared Concat( ) method and passing in s1 and s2, while the string s4 is created by using the overloaded concatenation (&) operator that concatenates two strings and returns a string as a result.

Visual Basic .NET supports two concatenation operators (+ and &); however, the plus sign (+) is also used for adding numeric values, and the Microsoft documentation suggests using the & operator to reduce ambiguity.
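
The ambiguity is easy to see in a small sketch (the module name is illustrative only). With two string operands, + and & both concatenate; with two numeric operands, + adds:

```vbnet
Option Strict On
Imports System

Module ConcatSketch
    Sub Main()
        Dim a As String = "1"
        Dim b As String = "2"

        Console.WriteLine(a & b)                      ' 12
        Console.WriteLine(a + b)                      ' 12 (both operands are strings)
        Console.WriteLine(1 + 2)                      ' 3  (both operands are numbers)
        Console.WriteLine(String.Concat(a, b, "3"))   ' 123
    End Sub
End Module
```

A reader who sees & knows at a glance that concatenation is intended, whatever the operand types.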

10.2.3 Copying Strings

Creating a new copy of a string can be accomplished in two ways. First, you can use the shared Copy( ) method:

Dim s5 As String = String.Copy(s2)

Or for convenience, you might simply use the assignment operator (=):

Dim s6 As String = s5

Assignment does not duplicate the characters. When you assign one string to another, the two variables refer to the same String object in memory. You might expect that altering one would therefore alter the other, but this is not the case: the String type is immutable, so any operation that appears to modify a string actually creates a new one. Thus, if after assigning s5 to s6 you alter s6, the two variables will refer to different Strings.

Example 10-3 illustrates how to copy strings.

Example 10-3. Copying strings
Option Strict On
Imports System
Namespace StringManipulation
    
   Class Tester
      
      Public Sub Run( )
         Dim s1 As String = "abcd"
         Dim s2 As String = "ABCD"
         
         ' the String copy method
         Dim s5 As String = String.Copy(s2)
         Console.WriteLine("s5 copied from s2: {0}", s5)
         
         ' use the assignment operator
         Dim s6 As String = s5
         Console.WriteLine("s6 = s5: {0}", s6)
      End Sub 'Run
      
      Public Shared Sub Main( )
         Dim t As New Tester( )
         t.Run( )
      End Sub 'Main
   End Class 'Tester
End Namespace 'StringManipulation

Output:
s5 copied from s2: ABCD
s6 = s5: ABCD
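
The effect of immutability on assigned strings can be checked with the Is operator, which tests whether two variables refer to the same object. This is a minimal sketch; the module name is hypothetical.

```vbnet
Option Strict On
Imports System

Module CopySketch
    Sub Main()
        Dim s5 As String = "ABCD"
        Dim s6 As String = s5

        ' After assignment, both variables refer to the same String object.
        Console.WriteLine(s6 Is s5)   ' True

        ' "Altering" s6 builds a new string and points s6 at it;
        ' the object that s5 refers to is untouched.
        s6 = s6 & "!"
        Console.WriteLine(s5)         ' ABCD
        Console.WriteLine(s6)         ' ABCD!
        Console.WriteLine(s6 Is s5)   ' False
    End Sub
End Module
```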

10.2.4 Testing for Equality

The .NET String class provides two ways to test for the equality of two strings. First, you can use the overloaded Equals( ) method and ask one string (say, s6) directly whether another string (s5) is of equal value:

Console.WriteLine("Does s6.Equals(s5)?: {0}", s6.Equals(s5))

A second technique is to pass both strings to the String class's shared method Equals( ):

Console.WriteLine("Does Equals(s6,s5)?: {0}", _ 
   String.Equals(s6, s5))

In each of these cases, the returned result is a Boolean value (True for equal and False for not equal). These techniques are demonstrated in Example 10-4.

Example 10-4. Are all strings created equal?
Option Strict On
Imports System
Namespace StringManipulation

    Class Tester

        Public Sub Run( )
            Dim s1 As String = "abcd"
            Dim s2 As String = "ABCD"

            ' the String copy method
            Dim s5 As String = String.Copy(s2)
            Console.WriteLine("s5 copied from s2: {0}", s5)

            ' assign s5 to s6 with the assignment operator
            Dim s6 As String = s5
            Console.WriteLine("s6 = s5: {0}", s6)

            ' member method
            Console.WriteLine("Does s6.Equals(s5)?: {0}", s6.Equals(s5))

            ' shared method
            Console.WriteLine("Does Equals(s6,s5)?: {0}", _ 
               String.Equals(s6, s5))

        End Sub 'Run

        Public Shared Sub Main( )
            Dim t As New Tester( )
            t.Run( )
        End Sub 'Main
    End Class 'Tester
End Namespace 'StringManipulation

Output:
s5 copied from s2: ABCD
s6 = s5: ABCD
Does s6.Equals(s5)?: True
Does Equals(s6,s5)?: True
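
The two forms of Equals( ) differ when one of the strings might be Nothing. The sketch below (module and variable names are hypothetical) shows that difference, along with the case-sensitive default and a case-insensitive test built on the three-argument Compare( ):

```vbnet
Option Strict On
Imports System

Module EqualitySketch
    Sub Main()
        Dim s1 As String = "abcd"
        Dim s2 As String = "ABCD"
        Dim missing As String = Nothing

        ' Equals( ) is case sensitive.
        Console.WriteLine(s1.Equals(s2))                      ' False

        ' A case-insensitive equality test using Compare( ).
        Console.WriteLine(String.Compare(s1, s2, True) = 0)   ' True

        ' The shared method accepts Nothing for either argument.
        Console.WriteLine(String.Equals(missing, s1))         ' False

        ' The instance method cannot be called on Nothing.
        Try
            Console.WriteLine(missing.Equals(s1))
        Catch ex As NullReferenceException
            Console.WriteLine("instance Equals( ) needs a string to call it on")
        End Try
    End Sub
End Module
```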

10.2.5 Other Useful String Methods

The String class includes a number of useful methods and properties for finding specific characters or substrings within a string, as well as for manipulating the contents of the string. A few such methods are demonstrated in Example 10-5. Following the output is a complete analysis.

Example 10-5. Useful string methods
Option Strict On
Imports System
Namespace StringManipulation
   
    Class Tester

        Public Sub Run( )
            Dim s1 As String = "abcd"
            Dim s2 As String = "ABCD"
            Dim s3 As String = "Liberty Associates, Inc. provides "
            s3 = s3 & "custom .NET development"

            ' the String copy method
            Dim s5 As String = String.Copy(s2)
            Console.WriteLine("s5 copied from s2: {0}", s5)

            ' The length
            Console.WriteLine("String s3 is {0} characters long. ", _
               s3.Length)

            Console.WriteLine( )
            Console.WriteLine("s3: {0}", s3)

            ' test whether a String ends with a set of characters
            Console.WriteLine("s3: ends with Training?: {0}", _
               s3.EndsWith("Training"))
            Console.WriteLine("Ends with development?: {0}", _
                s3.EndsWith("development"))

            Console.WriteLine( )
            ' return the index of the string
            Console.Write("The first occurrence of provides ")
            Console.WriteLine("in s3 is {0}", s3.IndexOf("provides"))

            ' hold the location of provides as an integer
            Dim location As Integer = s3.IndexOf("provides")

            ' insert the word usually before "provides"
            Dim s10 As String = s3.Insert(location, "usually ")
            Console.WriteLine("s10: {0}", s10)

            ' you can combine the two as follows:
            Dim s11 As String = _
                s3.Insert(s3.IndexOf("provides"), "usually ")
            Console.WriteLine("s11: {0}", s11)

            Console.WriteLine( )
            'use the Mid function to replace within the string
            Mid(s11, s11.IndexOf("usually") + 1, 9) = "always!"
            Console.WriteLine("s11 now: {0}", s11)

        End Sub 'Run

        Public Shared Sub Main( )
            Dim t As New Tester( )
            t.Run( )
        End Sub 'Main
    End Class 'Tester
End Namespace 'StringManipulation

Output:
s5 copied from s2: ABCD
String s3 is 57 characters long.

s3: Liberty Associates, Inc. provides custom .NET development
s3: ends with Training?: False
Ends with development?: True

The first occurrence of provides in s3 is 25
s10: Liberty Associates, Inc. usually provides custom .NET development
s11: Liberty Associates, Inc. usually provides custom .NET development

s11 now: Liberty Associates, Inc. always! provides custom .NET development

The Length property returns the length of the entire string:

Console.WriteLine("String s3 is {0} characters long. ", _
   s3.Length)

Here's the output:

String s3 is 57 characters long.

The EndsWith( ) method asks a string whether a substring is found at the end of the string. Thus, you might ask s3 first if it ends with "Training" (which it does not) and then if it ends with "development" (which it does):

Console.WriteLine("s3: ends with Training?: {0}", _
   s3.EndsWith("Training"))
Console.WriteLine("Ends with development?: {0}", _
    s3.EndsWith("development"))

The output reflects that the first test fails and the second succeeds:

s3: ends with Training?: False
Ends with development?: True

The IndexOf( ) method locates a substring within our string, and the Insert( ) method inserts a new substring into a copy of the original string. The following code locates the first occurrence of "provides" in s3:

Console.Write("The first occurrence of provides ")
Console.WriteLine("in s3 is {0}", s3.IndexOf("provides"))

The output indicates that the offset is 25:

The first occurrence of provides in s3 is 25

You can then use that value to insert the word "usually," followed by a space, into that string. Actually the insertion is into a copy of the string returned by the Insert( ) method and assigned to s10:

Dim s10 As String = s3.Insert(location, "usually ")
Console.WriteLine("s10: {0}", s10)

Here's the output:

s10: Liberty Associates, Inc. usually provides custom .NET development

Finally, you can combine these operations into a single, more compact insertion statement:

Dim s11 As String = s3.Insert(s3.IndexOf("provides"), "usually ")
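
The combined form assumes that the search string is present. IndexOf( ) returns -1 when it is not, and passing -1 to Insert( ) throws an ArgumentOutOfRangeException. A cautious version might look like the following sketch, in which InsertBefore is a hypothetical helper, not part of the String class:

```vbnet
Option Strict On
Imports System

Module InsertSketch
    ' Hypothetical helper: inserts addition before the first occurrence
    ' of target, or returns source unchanged when target is not found.
    Function InsertBefore(ByVal source As String, ByVal target As String, _
                          ByVal addition As String) As String
        Dim location As Integer = source.IndexOf(target)
        If location < 0 Then
            Return source
        End If
        Return source.Insert(location, addition)
    End Function

    Sub Main()
        Dim s3 As String = _
            "Liberty Associates, Inc. provides custom .NET development"

        ' "provides" is found, so "usually " is inserted before it.
        Console.WriteLine(InsertBefore(s3, "provides", "usually "))

        ' "Training" is not found, so s3 is printed unchanged.
        Console.WriteLine(InsertBefore(s3, "Training", "usually "))
    End Sub
End Module
```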

10.2.6 Finding Substrings

The String class has methods for finding and extracting substrings. For example, the IndexOf( ) method returns the index of the first occurrence of a string (or one or more characters) within a target string.

For example, given the definition of the string s1 as:

Dim s1 As String = "One Two Three Four"

you can find the first instance of the characters "hre" by writing:

Dim index As Integer = s1.IndexOf("hre")

This code will set the integer variable index to 9, which is the offset of the letters "hre" in the string s1.

Similarly, the LastIndexOf( ) method returns the index of the last occurrence of a string or substring. While the following code:

s1.IndexOf("o")

will return the value 6 (the first occurrence of the lowercase letter "o" is at the end of the word Two), the method call:

s1.LastIndexOf("o")

will return the value 15, because the last occurrence of "o" is in the word Four.

The Substring( ) method returns a series of characters. You can ask it for all the characters from a particular offset to the end of the string, or you can (optionally) pass a second argument that specifies how many characters to return.
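
A short sketch of the two forms follows (the module and variable names are illustrative). Note that the second argument to Substring( ) is a count of characters, not an ending offset, so extracting between two offsets means subtracting one from the other:

```vbnet
Option Strict On
Imports System

Module SubstringSketch
    Sub Main()
        Dim s1 As String = "One Two Three Four"

        ' One argument: everything from offset 8 to the end.
        Console.WriteLine(s1.Substring(8))        ' Three Four

        ' Two arguments: start at offset 8 and take 5 characters.
        Console.WriteLine(s1.Substring(8, 5))     ' Three

        ' To extract between two offsets, convert the end offset to a length.
        Dim startAt As Integer = s1.IndexOf("Two")        ' 4
        Dim endAt As Integer = s1.IndexOf(" ", startAt)   ' 7
        Console.WriteLine(s1.Substring(startAt, endAt - startAt))   ' Two
    End Sub
End Module
```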

The Substring( ) method is illustrated in Example 10-6.

Example 10-6. Finding substrings by index
Option Strict On
Imports System
Namespace StringSearch

    Class Tester

        Public Sub Run( )
            ' create some strings to work with
            Dim s1 As String = "One Two Three Four"

            Dim index As Integer

            ' get the index of the last space
            index = s1.LastIndexOf(" ")

            ' get the last word
            Dim s2 As String = s1.Substring(index + 1)

            ' set s1 to the substring starting at 0
            ' and running for index characters (up to the last space),
            ' thus s1 has One Two Three
            s1 = s1.Substring(0, index)

            ' find the last space in s1 (after "Two")
            index = s1.LastIndexOf(" ")

            ' set s3 to the substring starting at 
            ' index, the space after "Two" plus one more
            ' thus s3 = "Three"
            Dim s3 As String = s1.Substring(index + 1)

            ' reset s1 to the substring starting at 0
            ' and ending at index, thus the String "One Two"
            s1 = s1.Substring(0, index)

            ' reset index to the space between 
            ' "One" and "Two"
            index = s1.LastIndexOf(" ")

            ' set s4 to the substring starting one
            ' space after index, thus the substring "Two"
            Dim s4 As String = s1.Substring(index + 1)

            ' reset s1 to the substring starting at 0
            ' and ending at index, thus "One"
            s1 = s1.Substring(0, index)

            ' set index to the last space, but there is 
            ' none so index now = -1
            index = s1.LastIndexOf(" ")

            ' set s5 to the substring at one past
            ' the last space. there was no last space
            ' so this sets s5 to the substring starting
            ' at zero
            Dim s5 As String = s1.Substring(index + 1)

            Console.WriteLine("s1: {0}", s1)
            Console.WriteLine("s2: {0}", s2)
            Console.WriteLine("s3: {0}", s3)
            Console.WriteLine("s4: {0}", s4)
            Console.WriteLine("s5: {0}", s5)
        End Sub 'Run


        Public Shared Sub Main( )
            Dim t As New Tester( )
            t.Run( )
        End Sub 'Main
    End Class 'Tester
End Namespace 'StringSearch

Output:
s1: One
s2: Four
s3: Three
s4: Two
s5: One

Example 10-6 is not the most elegant solution possible to the problem of extracting words from a string, but it is a good first approximation and it illustrates a useful technique. The example begins by creating a string, s1:

Dim s1 As String = "One Two Three Four"

The local variable index is assigned the index of the last space in the string (which comes before the word Four):

index = s1.LastIndexOf(" ")

The substring that begins one space later is assigned to the new string, s2:

Dim s2 As String = s1.Substring(index + 1)

This extracts the characters from index +1 to the end of the line (i.e., the string "Four"), assigning the value "Four" to s2.

The next step is to remove the word Four from s1. You can do this by assigning to s1 the substring of s1 that begins at 0 and runs for index characters, which ends it just before the last space:

s1 = s1.Substring(0, index)

You reassign index to the last (remaining) space, which points you to the beginning of the word Three. You then extract the characters "Three" into string s3. You can continue like this until you've populated s4 and s5. Finally, you display the results:

s1: One
s2: Four
s3: Three
s4: Two
s5: One
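
The repeated find-and-trim steps fold naturally into a loop. The following is one possible sketch, not the book's method: WordsFromEnd is a hypothetical helper, and it assumes .NET 2.0 or later for the generic List(Of String) class.

```vbnet
Option Strict On
Imports System
Imports System.Collections.Generic

Module LastWordSketch
    ' Hypothetical helper: peels words off the end of a space-separated
    ' string and returns them last word first.
    Function WordsFromEnd(ByVal source As String) As String()
        Dim words As New List(Of String)()
        Dim remaining As String = source
        Dim index As Integer = remaining.LastIndexOf(" ")

        ' LastIndexOf( ) returns -1 once no space is left.
        Do While index >= 0
            words.Add(remaining.Substring(index + 1))    ' the last word
            remaining = remaining.Substring(0, index)    ' everything before it
            index = remaining.LastIndexOf(" ")
        Loop

        words.Add(remaining)    ' the first word has no space before it
        Return words.ToArray()
    End Function

    Sub Main()
        ' Prints Four, Three, Two, One on separate lines.
        For Each word As String In WordsFromEnd("One Two Three Four")
            Console.WriteLine(word)
        Next
    End Sub
End Module
```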

10.2.7 Splitting Strings

A more effective solution to the problem illustrated in Example 10-6 would be to use the Split( ) method of String, which parses a string into substrings. To use Split( ), you pass in an array of delimiters (characters that will indicate where to divide the words). The method returns an array of substrings. Example 10-7 illustrates. The complete analysis follows the code.

Example 10-7. The Split( ) method
Option Strict On
Imports System
Namespace StringSearch

    Class Tester

        Public Sub Run( )
            ' create some Strings to work with
            Dim s1 As String = "One,Two,Three Liberty Associates, Inc."

            ' constants for the space and comma characters
            Const Space As Char = " "c
            Const Comma As Char = ","c

            ' array of delimiters to split the sentence with
            Dim delimiters( ) As Char = {Space, Comma}

            Dim output As String = ""
            Dim ctr As Integer = 0

            ' split the String and then iterate over the
            ' resulting array of strings
            Dim resultArray As String( ) = s1.Split(delimiters)

            Dim subString As String
            For Each subString In resultArray
                ctr = ctr + 1
                output &= ctr.ToString( )
                output &= ": "
                output &= subString
                output &= Environment.NewLine
            Next subString
            Console.WriteLine(output)
        End Sub 'Run

        Public Shared Sub Main( )
            Dim t As New Tester( )
            t.Run( )
        End Sub 'Main
    End Class 'Tester
End Namespace 'StringSearch

Output:
1: One
2: Two
3: Three
4: Liberty
5: Associates
6:
7: Inc.

Example 10-7 starts by creating a string to parse:

Dim s1 As String = "One,Two,Three Liberty Associates, Inc."

The delimiters are set to the space and comma characters:

Const Space As Char = " "c
Const Comma As Char = ","c
Dim delimiters( ) As Char = {Space, Comma}

Double quotes are used in VB.NET to signal a string constant. The c after the string literals establishes that these are characters, not strings.

You then call Split( ) on the string, passing in the delimiters:

Dim resultArray As String( ) = s1.Split(delimiters)

Split( ) returns an array of the substrings that you can then iterate over using the For Each loop, as explained in Chapter 3:

Dim subString As String
For Each subString In resultArray
    ctr = ctr + 1
    output &= ctr.ToString( )
    output &= ": "
    output &= subString
    output &= Environment.NewLine
Next subString

You increment the counter variable, ctr. Then you build up the output string in four steps. You concatenate the string value of ctr. Next you add the colon, then the substring returned by Split( ), then the newline:

ctr = ctr + 1
output &= ctr.ToString( )
output &= ": "
output &= subString
output &= Environment.NewLine

With each concatenation, a new copy of the string is made, and all four steps are repeated for each substring found by Split( ). This repeated copying of the string is terribly inefficient.

The problem is that the String type is not designed for this kind of operation. What you want is to create a new string by appending a formatted string each time through the loop. The class you need is StringBuilder.

10.2.8 The StringBuilder Class

The System.Text.StringBuilder class is used for creating and modifying strings. Unlike the String class, StringBuilder is mutable; when you modify an instance of the StringBuilder class, you modify the actual string, not a copy. Semantically, StringBuilder is the encapsulation of a constructor for a string. The important members of StringBuilder are summarized in Table 10-2.

Table 10-2. StringBuilder members

Method or property    Explanation

Append( )             Overloaded public method that appends a typed object to the end of the current StringBuilder
AppendFormat( )       Overloaded public method that replaces format specifiers with the formatted value of an object
Capacity              Property that retrieves or assigns the number of characters the StringBuilder is capable of holding
Chars                 Property that contains the indexer
EnsureCapacity( )     Ensures that the current StringBuilder has a capacity at least as large as the specified value
Insert( )             Overloaded public method that inserts an object at the specified position
Length                Property that retrieves or assigns the length of the StringBuilder
MaxCapacity           Property that retrieves the maximum capacity of the StringBuilder
Remove( )             Removes the specified characters
Replace( )            Overloaded public method that replaces all instances of specified characters with new characters

Example 10-8 replaces the String object in Example 10-7 with a StringBuilder object.

Example 10-8. The StringBuilder class
Option Strict On
Imports System
Imports System.Text
Namespace StringSearch
    
    Class Tester

        Public Sub Run( )
            ' create some Strings to work with
            Dim s1 As String = "One,Two,Three Liberty Associates, Inc."

            ' constants for the space and comma characters
            Const Space As Char = " "c
            Const Comma As Char = ","c

            ' array of delimiters to split the sentence with
            Dim delimiters( ) As Char = {Space, Comma}

            Dim ctr As Integer = 0

            ' split the String and then iterate over the
            ' resulting array of Strings
            Dim resultArray As String( ) = s1.Split(delimiters)

            Dim output As New StringBuilder( )
            Dim subString As String
            For Each subString In resultArray
                ctr = ctr + 1
                output.AppendFormat("{0}: {1}" & _
                   Environment.NewLine, ctr, subString)
            Next subString
            Console.WriteLine(output.ToString( ))
        End Sub 'Run

        Public Shared Sub Main( )
            Dim t As New Tester( )
            t.Run( )
        End Sub 'Main
    End Class 'Tester
End Namespace 'StringSearch

Only the last part of the program is modified from the previous example. Rather than using the concatenation operator to modify the string, you use the AppendFormat( ) method of StringBuilder to append new, formatted strings as you create them. This is much easier and far more efficient. The output is identical:

1: One
2: Two
3: Three
4: Liberty
5: Associates
6:
7: Inc.
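
Several other members from Table 10-2 can be seen in a small sketch. The module name and the BuildGreeting function are hypothetical; the point is that every call modifies the same StringBuilder rather than producing a new string.

```vbnet
Option Strict On
Imports System
Imports System.Text

Module BuilderSketch
    ' Hypothetical function: exercises Append, Insert, Replace,
    ' Length, and AppendFormat on a single builder.
    Function BuildGreeting() As String
        Dim sb As New StringBuilder()
        sb.Append("Hello")
        sb.Append(", ").Append("World")    ' Append( ) returns the same builder
        sb.Insert(0, ">> ")                ' now ">> Hello, World"
        sb.Replace("World", "VB")          ' now ">> Hello, VB"
        sb.AppendFormat(" ({0} chars)", sb.Length)   ' Length is 12 at this point
        Return sb.ToString()
    End Function

    Sub Main()
        Console.WriteLine(BuildGreeting())   ' >> Hello, VB (12 chars)
    End Sub
End Module
```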

Delimiter Limitations

Because you passed in both the comma and the space as delimiters, the comma and the space between "Associates" and "Inc." are treated as two separate delimiters, and the empty string between them is returned as a word (number 6 in the output above). That is not what you want. To eliminate this, you need to tell Split( ) to match a comma (as between "One", "Two", and "Three") or a space (as between "Liberty" and "Associates") or a comma followed by a space. It is that last bit that is tricky and requires that you use a regular expression.
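
As a rough sketch of that idea, the shared Regex.Split( ) method in System.Text.RegularExpressions accepts a pattern instead of a character array. The pattern below is one possible choice, and SplitWords is a hypothetical helper:

```vbnet
Option Strict On
Imports System
Imports System.Text.RegularExpressions

Module SplitSketch
    ' Hypothetical helper: splits on a comma followed by a space,
    ' or on a lone comma or a lone space.
    Function SplitWords(ByVal source As String) As String()
        ' ", " is listed first so that a comma followed by a space
        ' is consumed as a single delimiter.
        Return Regex.Split(source, ", |[ ,]")
    End Function

    Sub Main()
        Dim s1 As String = "One,Two,Three Liberty Associates, Inc."
        Dim ctr As Integer = 0

        ' Prints six numbered words, with no empty entry before "Inc."
        For Each word As String In SplitWords(s1)
            ctr += 1
            Console.WriteLine("{0}: {1}", ctr, word)
        Next
    End Sub
End Module
```

On .NET 2.0 and later, another option is the Split( ) overload that takes StringSplitOptions.RemoveEmptyEntries, which drops the empty strings instead of changing how delimiters are matched.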

