PS: String Data Mining

One thing that separates Windows PowerShell from other shells (in particular the typical Unix shell) is this: while most operating system shells are text-based, Windows PowerShell is object-based. As you might expect, there are pros and cons to these two different approaches; as a general rule, however, it’s fair to say that Windows PowerShell requires much less text and string manipulation than its fellow operating system shells.

So does that mean that you never have to do text and string manipulation in Windows PowerShell? No, sorry: text and string manipulation is still required for anyone writing system administration scripts. Fortunately, Windows PowerShell (and the .NET Framework) includes all sorts of nifty little functions for manipulating text and string values. Let’s take a peek at some of the more interesting things you can do with text. To that end, we’re going to work primarily with the following variables and the following values:

$a = "Scripting Guys"
$b = "scripting guys"

Comparing Two String Values

So what kind of things do you need to do with string values? Well, one very common task is to compare these values. The following command compares the string variable $a with the string variable $b, and stores the results in a third variable ($d):

$d = $a.CompareTo($b)

As you can see, we’re simply taking $a and calling the CompareTo method, passing the second string ($b) as the sole method parameter. If CompareTo returns a 0, that means the two strings are equal; anything other than a 0 means that the two strings are different. (Technically, a -1 means that $a is less than $b; a 1 means that $a is greater than $b.)

When we run this command and echo back the value of $d we get the following:

1

Which means that the two strings are different.

What’s that? Some of you think that the two strings aren’t different? Well, that depends on whether you do a case-sensitive comparison (where an uppercase S is considered a different character than a lowercase s) or a case-insensitive comparison, where S and s are considered the same character. The CompareTo method always does a case-sensitive comparison. To do a case-insensitive comparison, use this command instead:

$d = [string]::Compare($a, $b, $True)

In this case we’re using the .NET Framework’s System.String class (that’s what the syntax [string] indicates). We then call the static method Compare (the two colons, ::, indicate a static method), passing it three parameters: the two strings we want to compare ($a and $b) and the Boolean value $True. This third parameter tells the Compare method whether it should ignore letter case when making comparisons. A value of $True means that it should go ahead and ignore letter case. After running this command $d will be equal to this:

0

Which means that, provided you ignore letter case, the two strings are equivalent.
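If all you need is a yes-or-no answer rather than a -1, 0, or 1, the built-in comparison operators are another option. The following is just a sketch using our two sample strings; the variable names ($sameIgnoringCase and so on) are ours and are made up for illustration:

```powershell
$a = "Scripting Guys"
$b = "scripting guys"

# -eq ignores letter case by default when it compares strings
$sameIgnoringCase = $a -eq $b

# -ceq is the case-sensitive version of the same operator
$sameWithCase = $a -ceq $b

# String.Equals lets you spell out the comparison rules explicitly
$sameOrdinal = [string]::Equals($a, $b, [System.StringComparison]::OrdinalIgnoreCase)

$sameIgnoringCase   # True
$sameWithCase       # False
$sameOrdinal        # True
```

Note that the operators behave the opposite way from CompareTo: they ignore letter case unless you ask for the c-prefixed version.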

You can also use the StartsWith and EndsWith methods to quickly determine whether a value starts or ends with a specific string. Want to know if the value of $a starts with the string Script? Then use this command:

$d = $a.StartsWith("Script")

In turn, $d will be True if the target text was found and False if it wasn’t.

Or check to see if the value ends with the target text:

$d = $a.EndsWith("Script")

Note that these comparisons are also case-sensitive. To do a case-insensitive comparison use a command similar to this one:

$d = $a.ToLower().StartsWith("script")

What’s the ToLower method for? We’ll explain that in just a minute.
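Converting to lowercase is not the only way to ignore letter case here. StartsWith and EndsWith also accept a second parameter, a System.StringComparison value, that controls how the comparison is done. Here is a minimal sketch (the variable names are ours, chosen for illustration):

```powershell
$a = "Scripting Guys"

# Ask for an ordinal comparison that ignores letter case
$startsIgnoringCase = $a.StartsWith("script", [System.StringComparison]::OrdinalIgnoreCase)
$endsIgnoringCase   = $a.EndsWith("GUYS", [System.StringComparison]::OrdinalIgnoreCase)

# Without the second parameter the comparison is case-sensitive
$endsWithCase = $a.EndsWith("GUYS")

$startsIgnoringCase   # True
$endsIgnoringCase     # True
$endsWithCase         # False
```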

Changing Text Case

As we’ve just seen, letter case is sometimes important. Because of that, you might want to use the ToUpper or the ToLower methods to convert all the letters in a string value to the same letter case; that way, letter case won’t interfere with any comparisons you make. To convert all the characters in a string to their uppercase equivalents use a command like this:

$d = $a.ToUpper()

In turn, $d will be equal to the following:

SCRIPTING GUYS

Alternatively, use the ToLower method to convert letters to their lowercase equivalents. This command takes the preceding all-uppercase value and converts the letters to lowercase:

$d = $d.ToLower()

Run this command and $d will be equal to this:

scripting guys

See? We told you we’d explain what the ToLower method is for.

Checking For Strings Within Strings

So what else do people do with string values? Well, one very common task is determining whether or not a given substring can be found anywhere within that value. For example, suppose you need to know if the string ript appears anywhere in the value of $a (which, as you might recall, is the string value Scripting Guys). How can we determine that? Like this:

$d = $a.Contains("ript")

All we’re doing here is taking $a and calling the Contains method, passing the target text (ript) as the only method parameter. The Contains method returns True if the target text can be found (which it can in this example) and False if the target text cannot be found.

It’s important to note that the Contains method always does a case-sensitive search; if we looked for the string RIPT, Contains would tell us that this text could not be found. What if we don’t care about letter case, and only care about the actual characters themselves? Well, one way to handle that is to convert both the string variable ($a) and the target text (RIPT) to all-lowercase or all-uppercase characters. This command returns the value True:

$d = $a.ToLower().Contains("RIPT".ToLower())
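If converting both strings feels like too much work, there are at least two other ways to do a case-insensitive search. The sketch below uses the IndexOf method (which returns -1 when the text is not found) and the -like wildcard operator; the variable names are hypothetical:

```powershell
$a = "Scripting Guys"

# IndexOf returns the position of the match, or -1 if there is no match
$foundByIndex = $a.IndexOf("RIPT", [System.StringComparison]::OrdinalIgnoreCase) -ge 0

# -like ignores letter case by default; the asterisks match any characters
$foundByLike = $a -like "*RIPT*"

# -clike is the case-sensitive version
$foundCaseSensitive = $a -clike "*RIPT*"

$foundByIndex         # True
$foundByLike          # True
$foundCaseSensitive   # False
```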

Replacing Text in a String

If you peruse the Hey, Scripting Guy! archive (and, of course, you should regularly peruse the Hey, Scripting Guy! archive) you might be amazed at how many scripts require you to replace one bit of text with a different bit of text. For example, suppose $a has been assigned the following value:

$a = "The Scriptign Guys"

That’s no good; we spelled Scripting wrong. But that’s OK; this is easy enough to fix:

$a = $a.Replace("Scriptign", "Scripting")

As you can see, all we’ve done here is assign $a a new value: the value of $a after we’ve called the Replace method. Note that we pass this method two parameters: the value we want to replace (Scriptign) and the replacement text (Scripting). After we run this command $a will be equal to this:

The Scripting Guys

Here’s another example. Suppose $a is equal to this:

$a = "Microsoft Scripting Guys"

Suppose we want to get rid of Scripting Guys. (And, trust us, we know plenty of people at Microsoft who do.) Well, once again we call the Replace method. This time the text to be replaced includes the blank space before the word Scripting (so we don’t leave a stray space behind), and the replacement text is an empty string (“”):

$a = $a.Replace(" Scripting Guys", "")

Once we run that command Scripting Guys will be gone and $a will be equal to this:

Microsoft

If only it were that easy, eh?
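One thing to keep in mind: the Replace method is case-sensitive, so it only replaces text that matches exactly. If letter case might vary, the -replace operator is one alternative. It ignores case by default, but it treats the search text as a regular expression, so special characters such as the period need to be escaped. The following is a sketch only, and the variable names and the price string are invented for illustration:

```powershell
$a = "The Scriptign Guys"

# The Replace method finds nothing here because the letter case differs
$unchanged = $a.Replace("scriptign", "Scripting")

# The -replace operator ignores letter case by default
$fixed = $a -replace "scriptign", "Scripting"

# -replace uses regular expressions; [regex]::Escape makes the search text literal
$price = "Cost: 5.00"
$updated = $price -replace [regex]::Escape("5.00"), "6.00"

$unchanged   # The Scriptign Guys
$fixed       # The Scripting Guys
$updated     # Cost: 6.00
```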

Returning a Portion of a String

Sometimes you don’t want the entire string value; you only want a portion of that string value. For example, when working with Active Directory you often get back names that look like this:

$e = "CN=Ken Myer"

In turn, you often find yourself writing code to strip away the CN= from the front of each name. Looking for an easy way to do that? Then you’ve come to the right place:

$e = $e.Substring(3)

All we’re doing here is taking our string variable and calling the Substring method. We pass Substring a single parameter: the starting position where we want to begin extracting characters. If we pass Substring a 3 (which we did) that means we want to start extracting characters from position 3 and – because we did not include the optional second parameter – we want to keep extracting characters until we reach the end of the string. In turn, that means $e will be equal to this:

Ken Myer

Ah, good point: The K in Ken Myer is the fourth character in the string; so how come we didn’t pass Substring a 4?

Believe it or not, there’s a simple explanation for that: the very first character in a string is considered character 0; character spots are actually numbered like this:

Position    0   1   2   3   4   5   6        7   8   9   10
Character   C   N   =   K   e   n   (space)  M   y   e   r

That’s why we pass a 3 rather than a 4.

And what about that mysterious second parameter? Actually, it’s not all that mysterious; it simply tells the Substring method how many characters to extract. Include this second parameter and Substring takes only the specified number of characters; leave it out, and Substring starts at the specified character position and then takes all the remaining characters in the string.

With that in mind, starting with our original value of $e (CN=Ken Myer), what do you suppose $e will be equal to after we run this command:

$e = $e.Substring(3,3)

You got it:

Ken
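In a real script you can’t always be sure that every name starts with CN=. One way to play it safe (and this is only a sketch, with variable names we made up) is to check for the prefix first and use its length rather than a hard-coded 3:

```powershell
$e = "CN=Ken Myer"
$prefix = "CN="

# Strip the prefix only if the string really starts with it
if ($e.StartsWith($prefix)) {
    $name = $e.Substring($prefix.Length)
}
else {
    $name = $e
}

# Take just the first three characters of what is left
$first = $name.Substring(0, 3)

$name    # Ken Myer
$first   # Ken
```

Using $prefix.Length also means you don’t have to count characters (or remember that counting starts at 0).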

Bonus Tip: Removing Characters From the Beginning of a String

Consider a folder containing a bunch of files similar to this (a sight familiar to digital camera users):

HIJK_111112.jpg
HIJK_111113.jpg
HIJK_111114.jpg
HIJK_111115.jpg

Suppose you want to remove the HIJK_ prefix from each of these file names. How can you do that? Well, here’s one way, using a string value instead of a file system object and file name property (although the approach is exactly the same):

$d = "HIJK_111112.jpg"
$e = $d.TrimStart("HIJK_")
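A word of caution: TrimStart does not remove the string HIJK_ as a unit. Instead, it treats the parameter as a set of individual characters (H, I, J, K, and _) and keeps removing characters from the start of the string until it hits one that is not in the set. That works for our camera files because the first character after the prefix is a digit. The sketch below shows a hypothetical file name where it goes wrong, along with one possible alternative based on StartsWith and Substring:

```powershell
$d = "HIJK_111112.jpg"
$e = $d.TrimStart("HIJK_")

# A made-up file name whose remainder begins with letters from the set
$tricky = "HIJK_JKL.jpg"
$overTrimmed = $tricky.TrimStart("HIJK_")

# Removing the prefix as a unit avoids the problem
$prefix = "HIJK_"
$safe = $tricky
if ($tricky.StartsWith($prefix)) {
    $safe = $tricky.Substring($prefix.Length)
}

$e             # 111112.jpg
$overTrimmed   # L.jpg
$safe          # JKL.jpg
```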

Turning a String Into an Array

Believe it or not, there might very well be times when you find it useful to convert a string value to an array. For example, suppose you have a part number like this:

$e = "9BY6742W"

It’s very possible that each character in that part number has a specific meaning; for example, maybe the initial character (9) represents the plant in which the item was manufactured. In a case like that, you might want to look at each character individually, something that’s extremely easy to do if the entire string has been converted to an array of individual characters. But how can you convert a string value to an array of individual characters?

We’re glad you asked that question:

$d = $e.ToCharArray()

As you can see, we’re simply taking the string variable $e and calling the ToCharArray method (no parameters required). What will $d look like after we call this method? An awful lot like this:

9
B
Y
6
7
4
2
W
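Once the characters are in an array you can pick them out by position (starting, as always, with 0) or loop through them one at a time. Here is a short sketch; the variable names, and the idea that the first character is a plant code, are just for illustration:

```powershell
$e = "9BY6742W"
$d = $e.ToCharArray()

# Individual characters by position; -1 means the last element
$plantCode = $d[0]
$lastChar  = $d[-1]
$count     = $d.Length

# Walk through the characters and record each one with its position
$report = @()
for ($i = 0; $i -lt $d.Length; $i++) {
    $report += "Position $i : $($d[$i])"
}

$plantCode    # 9
$lastChar     # W
$count        # 8
$report[2]    # Position 2 : Y
```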

Reference: The String’s the Thing
