PowerShell Cookbook

On this page

Counting Lines of Source Code in PowerShell
Updated: PowerShell Profile Locations Rationale
Great PowerShell Screencasts
DasBlog Upgrade to 1.9 Complete!
O'Reilly PowerShell Quick Reference Now Available
Exploring the Glide Path of PowerShell
SecureStrings and Plain Text in PowerShell
PowerShell range operator for other types
Exchange team demonstrates their PowerShell integration
Value types vs Reference types

Disclaimer
I work for Microsoft.

The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.

 Wednesday, October 18, 2006
Wednesday, October 18, 2006 8:49:13 PM (Pacific Daylight Time, UTC-07:00) ( )

Oren Eini recently ran into some performance problems while using PowerShell to count the number of lines in a source tree:

I wanted to know how many lines of code NHibernate has, so I run the following PowerShell command...
(gci -Recurse | select-string . ).Count
The result:
(Graphic of PowerShell still working after about 5 minutes, using 50% of his CPU.)
Bummer.

The performance problem with this command comes from PowerShell preparing for a rich pipeline experience that you never use: Select-String wraps every matching line in an object that carries its path, file name, and line number.  With only a little more text, you could have run much more powerful reports:

Line count per path:
   gci . *.cs -Recurse | select-string . | Group Path
Min / Max / Averages:
   gci . *.cs -Recurse | select-string . | Group Filename | Measure-Object Count -Min -Max -Average
Comment ratio: 
   $items = gci . *.cs -rec; ($items | select-string "//").Count / ($items | select-string .).Count
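
To make the "rich pipeline" point concrete, here is a small sketch of what Select-String emits for each matching line. The sample file name and its contents are made up for illustration; the properties shown (Path, Filename, LineNumber, Line) are the ones the reports above group and measure on.

```powershell
## Create a tiny throwaway file (hypothetical name and contents)
$sample = Join-Path ([System.IO.Path]::GetTempPath()) "sample.cs"
Set-Content $sample "// comment", "int x = 1;"

## Every match is a full object, not just a line of text
$match = Select-String . $sample | Select-Object -First 1

$match.Path         ## full path of the file that matched
$match.Filename     ## file name only
$match.LineNumber   ## line number of the match within the file
$match.Line         ## text of the matching line

Remove-Item $sample
```

Building one of these objects for every line of a large source tree is the cost you pay even when all you want is a count.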

But if you don't need that power, there are alternatives that perform better.  Let's look at some of them.  We'll use a baseline of the command that Oren started with:

[C:\temp]
PS:14 > $baseline = Measure-Command { (gci . *.cs -Recurse | Select-String .).Count }

… and a comparison to the LineCount.exe he pointed to:

PS:15 > $lineCountExe = Measure-Command { C:\temp\linecount.exe *.cs /s }
PS:16 > $baseline.TotalMilliseconds / $lineCountExe.TotalMilliseconds
41.5567286307833

(The Select-String approach is about 41.6x slower.)

Since we don't need all of the PowerShell metadata generated by Select-String, and we don't need the regular expression matching power of Select-String, we can instead use the [File]::ReadAllText() method from the .NET Framework:

PS:17 > $readAllText = Measure-Command { gci . *.cs -rec | % { [System.IO.File]::ReadAllText($_.FullName) } | Measure-Object -Line }
PS:18 > $readAllText.TotalMilliseconds / $lineCountExe.TotalMilliseconds
3.30927987204783

This is now about 3.3x slower – but is only 87 characters!  With a PowerShell one-liner, you were able to implement an entire linecount program.
If you want to go further, you can write a linecount program yourself:

## Get-LineCount.ps1
## Count the number of lines in all C# files in (and below)
## the current directory.

function CountLines($directory)
{
    $pattern = "*.cs"
    $directories = [System.IO.Directory]::GetDirectories($directory)
    $files = [System.IO.Directory]::GetFiles($directory, $pattern)

    $lineCount = 0

    foreach($file in $files)
    {
        $lineCount += [System.IO.File]::ReadAllText($file).Split("`n").Count
    }

    foreach($subdirectory in $directories)
    {
        $lineCount += CountLines $subdirectory
    }

    $lineCount
}

CountLines (Get-Location)

Now, about 2.7x slower – but in an easy to read, easy to modify format that saves you from having to open up your IDE and compiler.

PS:19 > $customScript = Measure-Command { C:\temp\Get-LineCount.ps1 }
PS:20 > $customScript.TotalMilliseconds / $lineCountExe.TotalMilliseconds
2.73733204860216
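
If memory is a concern, one possible variation (a sketch, not something measured for this post) is to read each file a line at a time rather than loading it whole. The function name Count-LinesStreaming is hypothetical, and the try / finally block assumes PowerShell 2.0 or later. Note that the counts can differ slightly from the script above: Split("`n").Count returns one more than the number of newline characters, so a file that ends with a newline is counted one line higher there than ReadLine() reports here.

```powershell
## Count-LinesStreaming (hypothetical name, not from the original script)
## Count lines in all matching files in (and below) a directory,
## reading one line at a time instead of loading each file whole.

function Count-LinesStreaming([string] $directory, [string] $pattern = "*.cs")
{
    $lineCount = 0

    ## AllDirectories makes GetFiles walk the subdirectories for us
    $files = [System.IO.Directory]::GetFiles(
        $directory, $pattern, [System.IO.SearchOption]::AllDirectories)

    foreach($file in $files)
    {
        $reader = New-Object System.IO.StreamReader $file
        try
        {
            ## ReadLine returns $null at end of file; an empty line is ""
            while($null -ne $reader.ReadLine())
            {
                $lineCount++
            }
        }
        finally
        {
            $reader.Dispose()
        }
    }

    $lineCount
}

## Usage: Count-LinesStreaming (Get-Location).Path
```

Passing an absolute path matters here, since .NET methods resolve relative paths against the process's working directory rather than the current PowerShell location.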

And to nip an annoying argument in the bud:

## Get-LineCount.rb
## Count the number of lines in all C# files in (and below)
## the current directory

require 'find'

def filelines(file)
  count = 0
  while line = file.gets
     count += 1
  end
  count
end

def countFile(filename)
    file = File.open(filename)
    totalCount = filelines(file)
    file.close()
    totalCount
end   

totalCount = 0

files = Dir['**/*.cs']
files.each { |filename| totalCount += countFile(filename) }

puts totalCount

Which gives:

PS:21 > $rubyScript = Measure-Command { C:\temp\Get-LineCount.rb }
PS:22 > $rubyScript.TotalMilliseconds / $lineCountExe.TotalMilliseconds
3.0709602651302

Comments [5] | | # 
 Wednesday, October 04, 2006
Thursday, October 05, 2006 4:31:00 AM (Pacific Daylight Time, UTC-07:00) ( )

You may have noticed that we changed the location of the default profiles yet again! 

The one in "My Documents" is now under "WindowsPowerShell," rather than "PsConfiguration."
The one in "All User's Documents" is now under the installation root.

Since you are probably wondering why, I've updated my last explanation: The Story Behind the Naming and Location of PowerShell Profiles.

Comments [0] | | # 
 Tuesday, October 03, 2006
Tuesday, October 03, 2006 6:30:37 PM (Pacific Daylight Time, UTC-07:00) ( )

Several people have been churning out screencasts lately, and they are really great resources.

On the Channel9 front, David Aiken recently posted two DFO Show screencasts:

In addition, Doug Finke and the Lab49 team just posted an overview of some PowerShell features – Variables, Arrays, Hashtables, Functions, and PsObject.

For interactive learners, the screencast medium is a great resource.  Not only do you get to watch over somebody's shoulder, but the vocal accompaniment also provides the "Proximity Effect" benefit of working under the guidance of an expert.

Comments [3] | | # 
 Monday, October 02, 2006
Tuesday, October 03, 2006 6:39:46 AM (Pacific Daylight Time, UTC-07:00) ( )

Now that DasBlog 1.9 has been released, I've finally upgraded.  Everything appears to have gone well, and I'm really happy with some of the new themes available.  You won't necessarily notice this if you only read the site through your RSS reader, so why don't you come on down and check out the new look? :)

Please let me know if you see anything broken!

 

Comments [1] | | # 
 Friday, September 29, 2006
Friday, September 29, 2006 10:18:27 PM (Pacific Daylight Time, UTC-07:00) ( )

My O'Reilly PowerShell Quick Reference is now available – a 120-page guide (in PDF format) that provides the essential reference material for your day-to-day use of PowerShell.  With a concise explanation at your fingertips, there is no need to memorize esoteric topics like regular expressions and string formatting specifiers.

Aside from its straight factual reference material, the Quick Reference also provides an enormous amount of value by distilling large bodies of knowledge into their most usable forms, including:

  • The most useful classes from the .NET Framework
  • The most useful WMI classes for WMI-based tasks
  • The most useful COM automation objects

You can get it here: http://www.oreilly.com/catalog/windowspowershell/.  If you are interested in reviewing it for your blog or publication, let me know and I can try to arrange it.

Tip: If you decide to print it out, set your printer to do it double-sided, in booklet format.  Cut it in the middle, then staple it in the correct order -- it really looks great in that format.

[Edit: Please let me know if you would be interested in a printed version if it were available for slightly more.]

Comments [9] | | # 
 Wednesday, September 27, 2006
Wednesday, September 27, 2006 5:00:57 PM (Pacific Daylight Time, UTC-07:00) ( )

As people begin to explore PowerShell, the question often comes up – "Will PowerShell support the cmdlet or language feature <foo>?"

For example (from the newsgroup), "Is there a way to get the unmodified contents of a file – since Get-Content splits it into lines?"  Or from Occasionally Rabid Programmer, "… Is there a zipping PowerShell class yet?"  For many of these questions, the answer is, "Yes. Since PowerShell gives you access to all of the .NET Framework, you could also ask 'Does the .NET Framework support <Foo> yet?'"

Clearly, PowerShell is evolving, and we have no plans to make you write everything yourself.

In many scenarios, though, people solve their problem using classes from the .NET Framework, but then worry that the solution isn't "PowerShell-y" enough.

There is sometimes a distinction, but usually there is not.  We designed PowerShell to allow you to smoothly glide between typical scripting tasks and more complex programming tasks -- our support for .NET being a key to this spectacular amount of breadth.

For example, PowerShell will likely never support random I/O on a file stream through a cmdlet or language primitive.  That's a task that the .NET Framework handles well, using a model that users are familiar with.  It doesn't make much sense for PowerShell to re-implement all of that, just as there will always be sections of the Win32 API that .NET users need to call through P/Invoke techniques.

As you move away from goals obviously covered by language primitives and stock cmdlets, there comes a point where it just makes sense to leverage the powerful features of .NET interop.  Learning the breadth of the .NET Framework Library takes some patience, but is well worth it.  You can start by browsing and searching on http://msdn.microsoft.com/library/, and it is also a topic that my upcoming PowerShell Quick Reference deals with in some depth.
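
As a rough sketch of that glide path, the two scenarios mentioned in this post (unmodified file contents, and random I/O on a file stream) might look like this when written directly against the .NET Framework. The temporary file and its contents are made up for illustration, and the try / finally block assumes PowerShell 2.0 or later.

```powershell
## A throwaway file to work with (contents are illustrative only)
$path = [System.IO.Path]::GetTempFileName()
[System.IO.File]::WriteAllText($path, "Hello`r`nWorld")

## Unmodified file contents as one string, not split into lines
$content = [System.IO.File]::ReadAllText($path)
$content.Length

## Random I/O: seek to a byte offset and read from there
$stream = [System.IO.File]::OpenRead($path)
try
{
    ## Skip "Hello" plus the two line-ending bytes
    [void] $stream.Seek(7, [System.IO.SeekOrigin]::Begin)

    $buffer = New-Object byte[] 5
    $bytesRead = $stream.Read($buffer, 0, $buffer.Length)
    $word = [System.Text.Encoding]::ASCII.GetString($buffer, 0, $bytesRead)
}
finally
{
    $stream.Dispose()
}

$word
Remove-Item $path
```

Nothing here needed a dedicated cmdlet: the script simply calls the same classes a C# program would.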

Comments [0] | | # 
 Thursday, September 07, 2006
Thursday, September 07, 2006 8:22:09 PM (Pacific Daylight Time, UTC-07:00) ( )

A question came up today on the newsgroup asking why the following didn't work:

PS >$secureString = ConvertTo-SecureString "Hello"
ConvertTo-SecureString : Cannot process argument because the value of argume
nt "input" is invalid. Change the value of the "input" argument and run the
operation again.
At line:1 char:39
+ $secureString = ConvertTo-SecureString  <<<< "Hello"
PS >

For some background -- a SecureString is a type of string that PowerShell (and .NET) keeps encrypted in memory.  Even if an attacker can explore the memory on your computer (the contents of a swap file, for example), they cannot gain access to the secret protected by the SecureString.

Although you can pass around SecureStrings with impunity, applications must be extremely careful at the boundaries -- when creating SecureStrings and retrieving the encrypted data from them.  This means doing things like reading your password input character by character, then removing each character from memory as soon as possible.  If the data is ever stored as a regular string, it stays in memory until the process exits.

When you type a regular string onto the command line (as in the example), that string can no longer be made secure.  That specific string stays in memory until PowerShell exits.  This is why ConvertTo-SecureString only accepts the encrypted output of ConvertFrom-SecureString.  Only in that way can we retain the security guarantee of SecureStrings.

That said, most people aren't that concerned about an attacker spying on their machine's memory, or digging through their Windows pagefile.  In many cases, the benefit of being able to automate these tasks vastly outweighs the potential security risk.

For the upcoming release candidate, we've added some new functionality to allow this:

$secureString = ConvertTo-SecureString "Hello" -AsPlainText -Force

(The force flag lets you bypass the warning I just gave you :)   )

Until then, you can create SecureStrings from plain text this way:

$text = "Hello World"
$secureString = new-object Security.SecureString
$text.ToCharArray() | % { $secureString.AppendChar($_) }
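
The other boundary mentioned above is getting the secret back out. A minimal sketch, assuming you accept the same trade-off (the result is a regular string that lingers in memory), is to go through the .NET Marshal class. The variable names are illustrative, and the try / finally block assumes PowerShell 2.0 or later.

```powershell
## Build a SecureString from plain text, as shown above
$text = "Hello World"
$secureString = New-Object Security.SecureString
$text.ToCharArray() | % { $secureString.AppendChar($_) }

## Copy the decrypted contents into unmanaged memory
$bstr = [System.Runtime.InteropServices.Marshal]::SecureStringToBSTR($secureString)
try
{
    ## This creates a regular .NET string, with all the caveats above
    $plainText = [System.Runtime.InteropServices.Marshal]::PtrToStringBSTR($bstr)
}
finally
{
    ## Zero out and free the unmanaged copy as soon as possible
    [System.Runtime.InteropServices.Marshal]::ZeroFreeBSTR($bstr)
}

$plainText
```

The unmanaged copy is wiped right away, but $plainText is an ordinary string again, so this only makes sense where automation matters more than the in-memory guarantee.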

[Edit: Ouch... my first dupe!]

Comments [0] | | # 
Thursday, September 07, 2006 4:59:38 PM (Pacific Daylight Time, UTC-07:00) ( )

The PowerShell range operator allows you to generate lists of numbers out of more compact expressions:

PS >1..5
1
2
3
4
5

PS >6..4
6
5
4

Something that came up on the newsgroup was the desire for range operators on data types other than the numeric ones.  It’s a feature we wanted to add, but weren’t able to.  In the meantime, this script / function accomplishes the goal quite well:

PS >range Friday Monday DayOfWeek
Friday
Thursday
Wednesday
Tuesday
Monday

PS >range y v char
y
x
w
v

## Range.ps1
## Support ranges on variable types other than numeric
## 
## range Stop Inquire System.Management.Automation.ActionPreference
## range Friday Monday DayOfWeek
## range e g char

param([string] $first, [string] $second, [string] $type)

$rangeStart = [int] ($first -as $type)
$rangeEnd = [int] ($second -as $type)

$rangeStart..$rangeEnd  | % { $_ -as $type }
Comments [2] | | # 
 Tuesday, September 05, 2006
Wednesday, September 06, 2006 5:59:26 AM (Pacific Daylight Time, UTC-07:00) ( )

Over the past few days, the Exchange team has been demo-blogging features of their new administrative interface.  The power is obvious, and the benefit it brings to administrators is palpable.

They’ve built their entire administration surface on PowerShell cmdlets, so today’s post really ties the series together nicely. 

They go through 14 of their most common scenarios with screen shots to illustrate all of the GUI actions.  Then, for each, they show some incredibly intuitive PowerShell one-liners that accomplish the same task (and more!)

The Exchange team deserves a lot of credit – this is truly a phenomenal advance in the administration world.  Even if you’ve seen it discussed before, you should definitely take a look:

"Management Console Overview": http://msexchangeteam.com/archive/2006/08/21/428728.aspx
"Recipient Management Overview": http://msexchangeteam.com/archive/2006/08/22/428740.aspx
"Recipient Management One-Liners": http://msexchangeteam.com/archive/2006/09/05/428833.aspx

Comments [0] | | # 
Tuesday, September 05, 2006 6:10:55 PM (Pacific Daylight Time, UTC-07:00) ( )

A question came up recently in the newsgroup, asking why adding seemingly different items to an ArrayList resulted in the ArrayList being full of the same item:

PS >$table = new-object Collections.ArrayList
PS >$row = 1 | select Col1,Col2,Col3
PS >$row.Col1 = "Column 1"
PS >$row.Col2 = "Column 2"
PS >[void] $table.Add($row)
PS >$row.Col1 = "Column 1, again"
PS >$row.Col2 = "Column 2, again"
PS >[void] $table.Add($row)
PS >$table

Col1                       Col2                       Col3
----                       ----                       ----
Column 1, again            Column 2, again
Column 1, again            Column 2, again


The source of the difference comes from the type of data that you assign to a variable.  Data falls into two categories: value types, and reference types.  Some data falls in the "value type" category, where its value is set in stone.  Changing the value actually creates a new one, as does passing it around:

PS > $a = "Foo"
PS > $b = $a
PS > $a += " Test"
PS > $a
Foo Test
PS > $b
Foo

Think of this like a promise you give to somebody.  If that person gives the same promise to somebody else, it's a new promise.  If you change your promise, it's a new promise yet again. 
 
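Value types are the .NET primitives: int, bool, double, float, etc.  Strings behave the same way: System.String is technically a reference type, but it is immutable, so changing one always creates a new string.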
 
The other type category is called a "reference type."  Think of this like a bag you give to somebody.  If that person gives the bag to somebody else, it's still the same bag.  If you put stuff in the bag, take stuff out of the bag, or modify the bag -- it's still the same bag.
 
Most types are reference types.  Examples include ArrayList, StringBuilder, and some of the PowerShell primitives (e.g., the Hashtable) -- since they are not .NET primitives.
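
A quick way to see the "same bag" behavior for yourself is the following sketch. The variable names are illustrative only.

```powershell
## Reference type: both variables point at the same ArrayList
$first = New-Object Collections.ArrayList
$second = $first
[void] $first.Add("added through first")

$second.Count                                ## the item shows up here, too
[object]::ReferenceEquals($first, $second)   ## same object, not a copy

## Value type: assignment copies the value
$x = 5
$y = $x
$x += 1

$y                                           ## unaffected by the change to $x
```

Assigning $first to $second hands over the bag itself, while assigning $x to $y makes a fresh copy of the number.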
 
The expression "$row = 1 | select Col1, Col2, Col3" gives $row a PsObject -- which is a reference type.  When you change values in $row, you change them for everything that refers to $row – including both entries inside the ArrayList, which are references to that same object rather than copies.

In this case, the solution is to create new rows each time:

PS >$table = new-object Collections.ArrayList
PS >$row = 1 | select Col1,Col2,Col3
PS >$row.Col1 = "Column 1"
PS >$row.Col2 = "Column 2"
PS >[void] $table.Add($row)
PS >$row = 1 | select Col1,Col2,Col3
PS >$row.Col1 = "Column 1, again"
PS >$row.Col2 = "Column 2, again"
PS >[void] $table.Add($row)
PS >$table

Col1                       Col2                       Col3
----                       ----                       ----
Column 1                   Column 2
Column 1, again            Column 2, again
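
If you add many rows, one way to make "a new row each time" hard to forget is to wrap the row creation in a small function. This is only a sketch: New-Row is a hypothetical helper name, not a built-in command.

```powershell
## New-Row (hypothetical helper): returns a brand new object on every call
function New-Row($col1, $col2)
{
    $row = 1 | Select-Object Col1,Col2,Col3
    $row.Col1 = $col1
    $row.Col2 = $col2
    $row
}

$table = New-Object Collections.ArrayList
[void] $table.Add((New-Row "Column 1" "Column 2"))
[void] $table.Add((New-Row "Column 1, again" "Column 2, again"))

## Each entry keeps its own values, since each is a distinct object
$table
```

Because each call runs Select-Object again, no two entries in the ArrayList can end up sharing the same underlying object.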

 

Comments [2] | | #