17.2 Manipulating Strings

The String class provides a host of methods for comparing, searching, and manipulating strings, the most important of which are shown in Table 17-1.

Table 17-1. String class methods

Method or property  Explanation
Chars               Property that returns the string indexer
Compare()           Overloaded public static method that compares two strings
Copy()              Public static method that creates a new string by copying another
Equals()            Overloaded public static and instance method that determines if two strings have the same value
Format()            Overloaded public static method that formats a string using a format specification
Length              Property that returns the number of characters in the instance
PadLeft()           Right-aligns the characters in the string, padding to the left with spaces or a specified character
PadRight()          Left-aligns the characters in the string, padding to the right with spaces or a specified character
Remove()            Deletes the specified number of characters
Split()             Divides a string, returning the substrings delimited by the specified characters
StartsWith()        Indicates if the string starts with the specified characters
Substring()         Retrieves a substring
ToCharArray()       Copies the characters from the string to a character array
ToLower()           Returns a copy of the string in lowercase
ToUpper()           Returns a copy of the string in uppercase
Trim()              Removes all occurrences of a set of specified characters from beginning and end of the string
TrimEnd()           Behaves like Trim(), but only at the end
TrimStart()         Behaves like Trim(), but only at the start
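
As a quick illustration of a few of the methods in Table 17-1, here is a minimal sketch. The class name TableSketch and the helper RightAlign() are made up for this illustration and are not part of the String class; every string method called is from the table.

```csharp
using System;

namespace StringManipulation
{
   static class TableSketch
   {
      // hypothetical helper: trim whitespace, then pad on the left
      // so the text is right-aligned in a field of the given width
      public static string RightAlign(string text, int width)
      {
          return text.Trim().PadLeft(width);
      }

      static void Main()
      {
          string raw = "  abcd  ";

          Console.WriteLine("[{0}]", RightAlign(raw, 8));      // [    abcd]
          Console.WriteLine(raw.Trim().ToUpper());             // ABCD
          Console.WriteLine(raw.TrimStart().StartsWith("ab")); // True
          Console.WriteLine("abcd".Substring(1, 2));           // bc
          Console.WriteLine("abcd".Remove(1, 2));              // ad
      }
   }
}
```

Each call returns a new string; none of them changes the string it is called on.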

17.2.1 Comparing Strings

The Compare() method is overloaded. The first version takes two strings and returns a negative number if the first string sorts before the second, a positive number if it sorts after the second, and zero if the two are equal. The second version takes an additional Boolean argument that, when true, makes the comparison case-insensitive. Example 17-1 illustrates the use of Compare().

Example 17-1. Compare() method
using System;

namespace StringManipulation
{
   class Tester
   {
      public void Run()
      {
          // create some strings to work with
          string s1 = "abcd";
          string s2 = "ABCD";
          int result;  // hold the results of comparisons

          // compare two strings, case sensitive
          result = string.Compare(s1, s2);
          Console.WriteLine(
              "compare s1: {0}, s2: {1}, result: {2}\n", 
              s1, s2, result);            

          // overloaded compare, takes boolean "ignore case" 
          //(true = ignore case)
          result = string.Compare(s1,s2, true);
          Console.WriteLine("Compare insensitive. result: {0}\n", 
              result);            

      }

      [STAThread]
      static void Main()
      {
         Tester t = new Tester();
         t.Run();
      }
   }
}
Output:
compare s1: abcd, s2: ABCD, result: -1
Compare insensitive. result: 0

Example 17-1 begins by declaring two strings, s1 and s2, and initializing them with string literals:

string s1 = "abcd";
string s2 = "ABCD";

The convention used by Compare() is shared by comparison methods throughout .NET. A negative return value indicates that the first parameter is less than the second, a positive result indicates the first parameter is greater than the second, and a zero indicates they are equal. By default, Compare() applies the sorting rules of the current culture rather than comparing raw character codes. In Unicode (as in ASCII), a lowercase letter actually has a larger numeric value than its uppercase counterpart, but under culture-sensitive sorting rules, when two strings are identical except for case, the lowercase string comes first. Thus, the output indicates that s1 (abcd) is "less than" s2 (ABCD):

compare s1: abcd, s2: ABCD, result: -1

The second comparison uses an overloaded version of Compare(), which takes a third Boolean parameter, the value of which determines whether case should be ignored in the comparison. If the value of this "ignore case" parameter is true, the comparison is made without regard to case. This time the result is 0, indicating that the two strings are equal when case is ignored:

Compare insensitive. result: 0
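
To see the difference between culture-sensitive sorting and raw character codes, the following sketch contrasts Compare() with CompareOrdinal(). The class name CompareSketch and the helper OrdinalSign() are invented for this illustration. The results are reduced with Math.Sign() because only the sign of a comparison result is meaningful. No expected output is shown for the culture-sensitive call because it depends on the current culture.

```csharp
using System;

namespace StringManipulation
{
   static class CompareSketch
   {
      // hypothetical helper: compare by character code and
      // reduce the result to -1, 0, or 1
      public static int OrdinalSign(string a, string b)
      {
          return Math.Sign(string.CompareOrdinal(a, b));
      }

      static void Main()
      {
          string s1 = "abcd";
          string s2 = "ABCD";

          // culture-sensitive: lowercase typically sorts before uppercase
          Console.WriteLine(Math.Sign(string.Compare(s1, s2)));

          // ordinal: 'a' (97) is greater than 'A' (65)
          Console.WriteLine(OrdinalSign(s1, s2));   // 1

          // ordinal comparison that ignores case
          Console.WriteLine(
              string.Compare(s1, s2, StringComparison.OrdinalIgnoreCase)); // 0
      }
   }
}
```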

17.2.2 Concatenating Strings

There are a couple of ways to concatenate strings in C#. You can use the Concat() method, which is a static public method of the String class:

string s3 = string.Concat(s1,s2);

or you can simply use the overloaded concatenation (+) operator:

string s4 = s1 + s2;

Example 17-2 demonstrates both of these methods.

Example 17-2. Concatenation
using System;

namespace StringManipulation
{
   class Tester
   {
      public void Run()
      {
          string s1 = "abcd";
          string s2 = "ABCD";
          
          // concatenation method
          string s3 = string.Concat(s1,s2);
          Console.WriteLine(
              "s3 concatenated from s1 and s2: {0}", s3);

          // use the overloaded operator
          string s4 = s1 + s2;
          Console.WriteLine(
              "s4 concatenated from s1 + s2: {0}", s4);

      }

      static void Main()
      {
         Tester t = new Tester();
         t.Run();
      }
   }
}
Output:
s3 concatenated from s1 and s2: abcdABCD
s4 concatenated from s1 + s2: abcdABCD

In Example 17-2, the new string s3 is created by calling the static Concat() method and passing in s1 and s2, while the string s4 is created by using the overloaded concatenation operator (+) that concatenates two strings and returns a string as a result.

17.2.3 Copying Strings

Similarly, you can create a new copy of a string in two ways. First, you can use the static Copy() method:

string s5 = string.Copy(s2);

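or, for convenience, you might instead assign one string variable to another with the assignment operator (=). Assignment does not copy the characters; it makes the new variable refer to the same string object. Because strings are immutable, sharing the object this way is safe: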

string s6 = s5;

Example 17-3 demonstrates string copying.

Example 17-3. Copying strings
using System;

namespace StringManipulation
{
   class Tester
   {
      public void Run()
      {
          string s1 = "abcd";
          string s2 = "ABCD";
          
          // the string copy method
          string s5 = string.Copy(s2);
          Console.WriteLine(
              "s5 copied from s2: {0}", s5);

          // assign the reference (no new string object is created)
          string s6 = s5;
          Console.WriteLine("s6 = s5: {0}", s6);
      }

      static void Main()
      {
         Tester t = new Tester();
         t.Run();
      }
   }
}
Output:
s5 copied from s2: ABCD
s6 = s5: ABCD
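
The difference between duplicating a string and assigning a reference can be checked with object.ReferenceEquals(). This is a minimal sketch: the class name CopySketch and the helper Duplicate() are invented for illustration. Newer versions of .NET flag string.Copy() as obsolete, so the sketch builds the duplicate from a character array instead.

```csharp
using System;

namespace StringManipulation
{
   static class CopySketch
   {
      // hypothetical helper: build a distinct string object
      // holding the same characters (source must be non-empty)
      public static string Duplicate(string source)
      {
          return new string(source.ToCharArray());
      }

      static void Main()
      {
          string s2 = "ABCD";
          string s5 = Duplicate(s2);   // new object, same characters
          string s6 = s5;              // same object as s5

          Console.WriteLine(object.ReferenceEquals(s6, s5)); // True
          Console.WriteLine(object.ReferenceEquals(s5, s2)); // False
          Console.WriteLine(s5 == s2);                       // True
      }
   }
}
```

Because strings are immutable, the distinction rarely matters in practice; sharing a reference is cheaper and just as safe.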

17.2.4 Testing for Equality

The .NET String class provides three ways to test for the equality of two strings. First, you can use the overloaded Equals() method and ask one string (say, s6) directly whether another string (s5) is of equal value:

Console.WriteLine(
    "\nDoes s6.Equals(s5)?: {0}",
    s6.Equals(s5));

You can also pass both strings to String's static method Equals():

Console.WriteLine(
    "Does Equals(s6,s5)?: {0}",
    string.Equals(s6,s5));

Or you can use the String class's overloaded equality operator (==):

Console.WriteLine(
    "Does s6==s5?: {0}", s6 == s5);

In each of these cases, the returned result is a Boolean value (true for equal and false for unequal). Example 17-4 demonstrates these techniques.

Example 17-4. Are all strings created equal?
using System;

namespace StringManipulation
{
   class Tester
   {
      public void Run()
      {
          string s1 = "abcd";
          string s2 = "ABCD";
          
          // the string copy method
          string s5 = string.Copy(s2);
          Console.WriteLine(
              "s5 copied from s2: {0}", s5);

          // assign the reference (s6 and s5 share one string object)
          string s6 = s5;
          Console.WriteLine("s6 = s5: {0}", s6);

          // member method 
          Console.WriteLine( 
              "\nDoes s6.Equals(s5)?: {0}",  
              s6.Equals(s5)); 

          // static method 
          Console.WriteLine(    
              "Does Equals(s6,s5)?: {0}",  
              string.Equals(s6,s5)); 

          // overloaded operator 
          Console.WriteLine( 
              "Does s6==s5?: {0}", s6 == s5); 
      }

      static void Main()
      {
         Tester t = new Tester();
         t.Run();
      }
   }
}

Output:
s5 copied from s2: ABCD
s6 = s5: ABCD

Does s6.Equals(s5)?: True
Does Equals(s6,s5)?: True
Does s6==s5?: True

The equality operator is the most natural of the three to use when you have two string objects. However, not every .NET language supports operator overloading (early versions of VB.NET did not), so if you overload == in a class of your own, be sure to override the Equals() instance method as well.
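
All three techniques compare case-sensitively. For a case-insensitive test, one option is the overload of the static Equals() method that takes a StringComparison value. This is a sketch only; the class name EqualsSketch and the helper SameIgnoringCase() are invented for illustration. It also shows that the static method accepts null, whereas calling the instance method on a null reference would throw.

```csharp
using System;

namespace StringManipulation
{
   static class EqualsSketch
   {
      // hypothetical helper: value equality that ignores case,
      // using ordinal (culture-independent) rules
      public static bool SameIgnoringCase(string a, string b)
      {
          return string.Equals(a, b, StringComparison.OrdinalIgnoreCase);
      }

      static void Main()
      {
          string s1 = "abcd";
          string s2 = "ABCD";
          string none = null;

          Console.WriteLine(s1 == s2);                  // False
          Console.WriteLine(SameIgnoringCase(s1, s2));  // True

          // the static method tolerates null;
          // none.Equals(s1) would throw NullReferenceException
          Console.WriteLine(string.Equals(none, s1));   // False
      }
   }
}
```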

17.2.5 Other Useful String Methods

The String class includes a number of useful methods and properties for finding specific characters or substrings within a string, as well as for manipulating the contents of the string. Example 17-5 demonstrates a few such methods. Following the output is a complete analysis.

Example 17-5. Useful methods of the String class
using System;

namespace StringManipulation
{
   class Tester
   {
      public void Run()
      {
          string s1 = "abcd";
          string s2 = "ABCD";
          string s3 = @"Liberty Associates, Inc. 
                provides custom .NET development, 
                on-site Training and Consulting";
           
          // the string copy method
          string s5 = string.Copy(s2);
          Console.WriteLine(
              "s5 copied from s2: {0}", s5);

          // Two useful properties: the index and the length
          Console.WriteLine(
              "\nString s3 is {0} characters long. ", 
              s5.Length);

          Console.WriteLine(
              "The 5th character is {0}\n", s3[4]);

          // test whether a string ends with a set of characters
          Console.WriteLine("s3:{0}\nEnds with Training?: {1}\n",
              s3, 
              s3.EndsWith("Training") );
          Console.WriteLine(
              "Ends with Consulting?: {0}",
              s3.EndsWith("Consulting"));

          // return the index of the substring
          Console.WriteLine(
              "\nThe first occurrence of Training ");
          Console.WriteLine ("in s3 is {0}\n", 
              s3.IndexOf("Training"));

          // insert the word excellent before "training"
          string s10 = s3.Insert(101,"excellent ");
          Console.WriteLine("s10: {0}\n",s10);

          // you can combine the two as follows:
          string s11 = s3.Insert(s3.IndexOf("Training"),
              "excellent ");
          Console.WriteLine("s11: {0}\n",s11);
      }

      [STAThread]
      static void Main()
      {
         Tester t = new Tester();
         t.Run();
      }
   }
}

Output:
s5 copied from s2: ABCD

String s3 is 4 characters long.
The 5th character is r

s3:Liberty Associates, Inc.
                provides custom .NET development,
                on-site Training and Consulting
Ends with Training?: False

Ends with Consulting?: True

The first occurrence of Training
in s3 is 101

s10: Liberty Associates, Inc.
                provides custom .NET development,
                on-site excellent Training and Consulting

s11: Liberty Associates, Inc.
                provides custom .NET development,
                on-site excellent Training and Consulting

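The Length property returns the number of characters in a string, and the index operator ([]) retrieves the character at a zero-based position within it. Note that the code reads Length from s5, so the value printed (4) is the length of "ABCD" even though the message names s3; use s3.Length to report the length of the longer string: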

Console.WriteLine(
    "\nString s3 is {0} characters long. ", 
    s5.Length);

Console.WriteLine(
    "The 5th character is {0}\n", s3[4]);

Here's the output:

String s3 is 4 characters long.
The 5th character is r

The EndsWith() method asks a string whether a substring is found at the end of the string. Thus, you might first ask s3 if it ends with "Training" (which it does not) and then if it ends with "Consulting" (which it does):

Console.WriteLine("s3:{0}\nEnds with Training?: {1}\n",
    s3, 
    s3.EndsWith("Training") );
Console.WriteLine(
    "Ends with Consulting?: {0}",
    s3.EndsWith("Consulting"));

The output reflects that the first test fails and the second succeeds:

Ends with Training?: False
Ends with Consulting?: True

The IndexOf() method locates a substring within a string, and the Insert() method inserts a new substring into a copy of the original string. The following code locates the first occurrence of "Training" in s3:

Console.WriteLine("\nThe first occurrence of Training ");
Console.WriteLine ("in s3 is {0}\n", 
    s3.IndexOf("Training"));

The output indicates that the offset is 101:

The first occurrence of Training
in s3 is 101

Then use that value to insert the word "excellent", followed by a space, into that string. Actually the insertion is into a copy of the string returned by the Insert() method and assigned to s10:

string s10 = s3.Insert(101,"excellent ");
Console.WriteLine("s10: {0}\n",s10);

Here's the output:

s10: Liberty Associates, Inc.
               provides custom .NET development,
               on-site excellent Training and Consulting

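Finally, combine these operations so that the offset is computed rather than hard-coded. The literal 101 is correct only for this exact string; the offset changes if the indentation or the line endings inside the verbatim literal change: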

string s11 = s3.Insert(s3.IndexOf("Training"),"excellent ");
Console.WriteLine("s11: {0}\n",s11);

with the identical result:

s11: Liberty Associates, Inc.
               provides custom .NET development,
               on-site excellent Training and Consulting
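
The combined statement assumes that the substring is present. IndexOf() returns -1 when it finds nothing, and passing -1 to Insert() throws an exception. A minimal sketch of a guarded version follows; the class name InsertSketch and the helper InsertBefore() are hypothetical names chosen for this illustration, and a shortened string stands in for s3.

```csharp
using System;

namespace StringManipulation
{
   static class InsertSketch
   {
      // hypothetical helper: insert 'addition' immediately before the
      // first occurrence of 'marker'; if the marker is absent,
      // return the original string unchanged
      public static string InsertBefore(
          string source, string marker, string addition)
      {
          int index = source.IndexOf(marker, StringComparison.Ordinal);
          if (index < 0)
          {
              return source;
          }
          return source.Insert(index, addition);
      }

      static void Main()
      {
          string s3 = "on-site Training and Consulting";

          // marker found: prints on-site excellent Training and Consulting
          Console.WriteLine(InsertBefore(s3, "Training", "excellent "));

          // marker not found: prints the string unchanged
          Console.WriteLine(InsertBefore(s3, "Support", "excellent "));
      }
   }
}
```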

17.2.6 Finding Substrings

The String class has methods for finding and extracting substrings. The IndexOf() method returns the index of the first occurrence of a string (or of a single character) within a target string, or -1 if there is no match. For example, given the definition of the string s1 as:

string s1 = "One Two Three Four";

you can find the first instance of the characters "hre" by writing:

int index = s1.IndexOf("hre");

This code sets the int variable index to 9, which is the offset of the letters "hre" in the string s1.

Similarly, the LastIndexOf() method returns the index of the last occurrence of a string or substring. While the following code:

s1.IndexOf("o");

returns the value 6 (the first occurrence of the lowercase letter o is at the end of the word Two), the method call:

s1.LastIndexOf("o");

returns the value 15 (the last occurrence of o is in the word Four).

The Substring() method returns a series of characters. You can ask it for all the characters from a particular offset to the end of the string, or you can (optionally) provide a second argument giving the number of characters to return. Note that the second argument is a length, not an ending offset. Example 17-6 illustrates the Substring() method.

Example 17-6. Finding substrings by index
using System;

namespace StringSearch
{
   class Tester
   {
      public void Run()
      {
          // create some strings to work with
          string s1 = "One Two Three Four"; 

          int index;

          // get the index of the last space
          index=s1.LastIndexOf(" ");
            
          // get the last word.
          string s2 = s1.Substring(index+1); 
            
          // set s1 to the substring starting at 0
          // with length index (everything before the last space)
          // thus s1 has "One Two Three"
          s1 = s1.Substring(0,index);    
       
          // find the last space in s1 (after two)
          index = s1.LastIndexOf(" ");

          // set s3 to the substring starting at 
          // index, the space after "two" plus one more
          // thus s3 = "three"
          string s3 = s1.Substring(index+1);

          // reset s1 to the substring starting at 0
          // and ending at index, thus the string "one two"
          s1 = s1.Substring(0,index);

          // reset index to the space between 
          // "one" and "two"
          index = s1.LastIndexOf(" ");

          // set s4 to the substring starting one
          // space after index, thus the substring "two"
          string s4 = s1.Substring(index+1);

          // reset s1 to the substring starting at 0
          // and ending at index, thus "one"
          s1 = s1.Substring(0,index);

          // set index to the last space, but there is 
          // none so index now = -1
          index = s1.LastIndexOf(" ");

          // set s5 to the substring at one past
          // the last space. there was no last space
          // so this sets s5 to the substring starting
          // at zero
          string s5 = s1.Substring(index+1);
            
          Console.WriteLine ("s2: {0}\ns3: {1}",s2,s3);
          Console.WriteLine ("s4: {0}\ns5: {1}\n",s4,s5);
          Console.WriteLine ("s1: {0}\n",s1);
      }

      static void Main()
      {
         Tester t = new Tester();
         t.Run();
      }
   }
}
Output:
s2: Four
s3: Three
s4: Two
s5: One

s1: One

Example 17-6 is not the most elegant solution possible to the problem of extracting words from a string, but it is a good first approximation, and it illustrates a useful technique. The example begins by creating a string, s1:

string s1 = "One Two Three Four";

The local variable index is assigned the index of the last space in the string (which comes before the word Four):

index=s1.LastIndexOf(" ");

The substring that begins one position later is assigned to the new string, s2:

string s2 = s1.Substring(index+1);

This extracts the characters from index+1 to the end of the string (i.e., the string Four) and assigns the value Four to s2.

The next step is to remove the word Four from s1; assign to s1 the substring of s1 that begins at 0 and is index characters long (that is, everything before the last space):

s1 = s1.Substring(0,index);

After this line executes, the variable s1 refers to a new string object containing that substring; the original string itself is not modified. A string object that nothing references any longer becomes eligible for garbage collection (a string literal such as this one is interned, however, so the runtime may keep it alive).

You reassign index to the last (remaining) space, which points you to the beginning of the word Three. You then extract the characters "Three" into string s3. Continue like this until you've populated s4 and s5. Finally, display the results:

s2: Four
s3: Three
s4: Two
s5: One
s1: One
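
The repeated steps of Example 17-6 can be folded into a loop. The following is a sketch under a few assumptions: the class name LastWordsSketch and the helper WordsFromEnd() are invented for illustration, and the generic List<string> collection it uses requires .NET 2.0 or later. It also shows that the second argument to Substring() is a length.

```csharp
using System;
using System.Collections.Generic;

namespace StringSearch
{
   static class LastWordsSketch
   {
      // hypothetical helper: peel words off the end of 'text',
      // one per pass, as Example 17-6 does by hand
      public static List<string> WordsFromEnd(string text)
      {
          List<string> words = new List<string>();
          string remaining = text;
          int index = remaining.LastIndexOf(' ');

          while (index >= 0)
          {
              // from just after the space to the end of the string
              words.Add(remaining.Substring(index + 1));

              // keep the first 'index' characters
              remaining = remaining.Substring(0, index);
              index = remaining.LastIndexOf(' ');
          }

          // no space left: what remains is the first word
          words.Add(remaining);
          return words;
      }

      static void Main()
      {
          // prints Four, Three, Two, One on separate lines
          foreach (string word in WordsFromEnd("One Two Three Four"))
          {
              Console.WriteLine(word);
          }

          // start at offset 4 and take 3 characters
          Console.WriteLine("One Two Three Four".Substring(4, 3)); // Two
      }
   }
}
```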

17.2.7 Splitting Strings

A more effective solution to the problem illustrated in Example 17-6 would be to use the String class's Split() method, which parses a string into substrings. To use Split(), pass in an array of delimiters (characters that indicate where to divide the words). The method returns an array of substrings (which Example 17-7 illustrates). The complete analysis follows the code.

Example 17-7. The Split() method
using System;

namespace StringSearch
{
    class Tester
    {
        public void Run()
        {
            // create some strings to work with
            string s1 = "One,Two,Three Liberty Associates, Inc.";

            // constants for the space and comma characters
            const char Space = ' ';
            const char Comma = ',';
    
            // array of delimiters to split the sentence with
            char[] delimiters = new char[] 
            {
                Space,
                Comma
            };

            string output = "";
            int ctr = 1;

            // split the string and then iterate over the
            // resulting array of strings

            String[] resultArray = s1.Split(delimiters); 

            foreach (String subString in resultArray) 
            { 
                output += ctr++; 
                output += ": "; 
                output += subString; 
                output += "\n"; 
            } 
            Console.WriteLine(output);

        }

        static void Main()
        {
            Tester t = new Tester();
            t.Run();
        }
    }
}
Output:
1: One
2: Two
3: Three
4: Liberty
5: Associates
6:
7: Inc.

Example 17-7 starts by creating a string to parse:

string s1 = "One,Two,Three Liberty Associates, Inc.";

The delimiters are set to the space and comma characters. Then call Split() on the string, passing in the delimiters:

String[] resultArray = s1.Split(delimiters);

Split() returns an array of the substrings that you can then iterate over using the foreach loop as explained in Chapter 6.

foreach (String subString in resultArray)

You can, of course, combine the call to split with the iteration, as in the following:

foreach (string subString in s1.Split(delimiters))

C# programmers are fond of combining statements like this. The advantage of splitting the statement into two, however, and of using an interim variable like resultArray is that you can examine the contents of resultArray in the debugger.

Before the foreach loop begins, output is initialized to an empty string. Inside the loop, build up the output string in four steps. Start by appending the current value of ctr to the output string with the += operator; the postfix ++ then increments ctr:

output += ctr++;

Next add the colon, then the substring returned by Split(), and then the newline.

output += ": ";
output += subString;
output += "\n";

With each concatenation, a new copy of the string is made, and all four steps are repeated for each substring found by Split().

This repeated copying of strings is terribly inefficient. The problem is that the string type is immutable, so it is not designed for this kind of operation. What you want is to build up a single string by appending formatted text each time through the loop. The class you need is StringBuilder.

17.2.8 The StringBuilder Class

You can use the System.Text.StringBuilder class for creating and modifying strings. Semantically, it is the encapsulation of a constructor for a string. Table 17-2 summarizes the important members of StringBuilder.

Table 17-2. StringBuilder members

Method or property  Explanation
Append()            Overloaded public method that appends a typed object to the end of the current StringBuilder
AppendFormat()      Overloaded public method that replaces format specifiers with the formatted value of an object
EnsureCapacity()    Ensures that the current StringBuilder has a capacity at least as large as the specified value
Capacity            Property that retrieves or assigns the number of characters the StringBuilder is capable of holding
Chars               Property that contains the indexer
Insert()            Overloaded public method that inserts an object at the specified position
Length              Property that retrieves or assigns the length of the StringBuilder
MaxCapacity         Property that retrieves the maximum capacity of the StringBuilder
Remove()            Removes the specified characters
Replace()           Overloaded public method that replaces all instances of specified characters with new characters

Unlike String, StringBuilder is mutable; when you modify an instance of the StringBuilder class, you modify the actual string, not a copy.
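
To make the mutability concrete, here is a minimal sketch that applies several of the members from Table 17-2 to a single StringBuilder. The class name BuilderSketch and the method Build() are invented for illustration; the comments show the contents of the builder after each call.

```csharp
using System;
using System.Text;

namespace StringSearch
{
   static class BuilderSketch
   {
      // hypothetical helper: apply several in-place edits to one
      // StringBuilder and return the final contents as a string
      public static string Build()
      {
          StringBuilder sb = new StringBuilder("abcd");

          sb.Append("EFGH");           // abcdEFGH
          sb.Insert(0, "-> ");         // -> abcdEFGH
          sb.Replace("EFGH", "efgh");  // -> abcdefgh
          sb.Remove(0, 3);             // abcdefgh
          sb.Length = 4;               // truncates to abcd

          return sb.ToString();
      }

      static void Main()
      {
          Console.WriteLine(Build());  // abcd
      }
   }
}
```

Every call modifies the same object; only the final ToString() produces a new string.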

Example 17-8 replaces the String object in Example 17-7 with a StringBuilder object.

Example 17-8. The StringBuilder class
using System;
using System.Text;

namespace StringSearch
{
   class Tester
   {
      public void Run()
      {
          // create some strings to work with
          string s1 = "One,Two,Three Liberty Associates, Inc."; 

          // constants for the space and comma characters
          const char Space = ' ';
          const char Comma = ',';
    
          // array of delimiters to split the sentence with
          char[] delimiters = new char[] 
          {
              Space,
              Comma
          };

          // use a StringBuilder class to build the
          // output string
          StringBuilder output = new StringBuilder();
          int ctr = 1;

          // split the string and then iterate over the
          // resulting array of strings
          foreach (string subString in s1.Split(delimiters))
          {
              // AppendFormat appends a formatted string
              output.AppendFormat("{0}: {1}\n",ctr++,subString);            
          }
          Console.WriteLine(output);

      }

      [STAThread]
      static void Main()
      {
         Tester t = new Tester();
         t.Run();
      }
   }
}

Only the last part of the program is modified. Rather than using the concatenation operator to modify the string, use the AppendFormat() method of StringBuilder to append new formatted strings as you create them. This is much easier and far more efficient. The output is identical:

1: One
2: Two
3: Three
4: Liberty
5: Associates
6:
7: Inc.

Because you passed in delimiters of both comma and space, Split() treats the comma and the space between "Associates" and "Inc." as two separate delimiters and returns the empty string between them as a word, numbered 6 in the previous output. That is not what you want. To eliminate this, you need to tell Split() to match a comma (as between One, Two, and Three), a space (as between Liberty and Associates), or a comma followed by a space. It is that last bit that is tricky and requires that you use a regular expression.
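
A regular expression is the general answer. For this particular case, a lighter alternative is to discard the empty entries, assuming .NET 2.0 or later, where Split() has an overload that accepts a StringSplitOptions value. The sketch below uses invented names (the class SplitSketch and the helper Words()); it is not the technique this chapter goes on to develop.

```csharp
using System;

namespace StringSearch
{
   static class SplitSketch
   {
      // hypothetical helper: split on spaces and commas, discarding
      // the empty strings that appear between adjacent delimiters
      public static string[] Words(string text)
      {
          char[] delimiters = new char[] { ' ', ',' };
          return text.Split(
              delimiters, StringSplitOptions.RemoveEmptyEntries);
      }

      static void Main()
      {
          int ctr = 1;

          // prints six numbered words, with no empty entry
          foreach (string word in
              Words("One,Two,Three Liberty Associates, Inc."))
          {
              Console.WriteLine("{0}: {1}", ctr++, word);
          }
      }
   }
}
```

This removes every empty entry, which is fine here but would also hide deliberately empty fields in data such as comma-separated records.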