Syntax Highlighting in PowerShell

Since we just released a CTP of PowerShell V2, I thought I’d share a handy little script to demonstrate one of the new APIs we introduced: Show-ColorizedContent.ps1.


This CTP introduces a new tokenizer API that lets you work with PowerShell script content the same way that our parser does: as a collection of items (tokens) that represent the underlying structure of that script. Until now, any tool that works with the data of a PowerShell script has needed to parse the script on its own, usually with fragile regular expressions or other means.


This works for simple scripts, but usually falls apart on complex ones:


[Image: a sample script as colored by a regular-expression-based syntax highlighter]


In the first line, "Write-Host" is an argument to the Write-Host cmdlet, but gets parsed as a string. Fair enough, but the second line does not treat the argument the same way. In fact, since it matches a cmdlet name, the argument gets parsed as another cmdlet call. In the here-string that follows, the Write-Host cmdlet name gets highlighted again, even though it is really just part of a string.


This is absolutely not a slam on the authors of existing highlighters — it’s just that we’ve now introduced something that makes life so much easier.


$content = [IO.File]::ReadAllText("c:\temp\ContentTest.ps1")
$errors = [System.Management.Automation.PSParseError[]] @()
[System.Management.Automation.PsParser]::Tokenize($content, [ref] $errors)

This API generates a collection of PSToken objects that give all the information you need to properly dissect a PowerShell script:

PS C:\Temp> [System.Management.Automation.PsParser]::Tokenize($content, [ref] $errors) | ft -auto

Content                Type            Start Length StartLine StartColumn EndLine EndColumn
-------                ----            ----- ------ --------- ----------- ------- ---------
Write-Host             Command             0     10         1           1       1        11
Write-Host             String             11     12         1          12       1        24
...                    NewLine            23      2         1          24       2         1
Write-Host             Command            25     10         2           1       2        11
Write-Host             CommandArgument    36     10         2          12       2        22
...                    NewLine            46      2         2          22       3         1
...                    NewLine            48      2         3           1       4         1
Write-Host Write-Host  String             50     23         4           1       4        24
...                    NewLine            73      2         4          24       5         1
...                    NewLine            75      2         5           1       6         1
testContent            Variable           77     12         6           1       6        13
=                      Operator           90      1         6          14       6        15
Write-Host Hello World String             92     30         6          16       8         3
...                    NewLine           122      2         8           3       9         1
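
To give a feel for working with these objects, here is a minimal sketch that stores the tokens in a variable, counts them by type, and uses Start and Length to pull the original text of each command argument back out of the script. The two-line `$content` string is an assumption that stands in for the first two lines of ContentTest.ps1, and the `$tokens` and `$arguments` names are just illustrative.

```powershell
# Stand-in for the first two lines of ContentTest.ps1 (an assumption for this sketch)
$content = "Write-Host `"Write-Host`"`r`nWrite-Host Write-Host`r`n"

# Tokenize fills $errors with any parse errors it runs into
$errors = $null
$tokens = [System.Management.Automation.PSParser]::Tokenize($content, [ref] $errors)

# How many tokens of each type did the script produce?
$tokens | Group-Object Type | Select-Object Name, Count

# Start and Length index into the original text, so Substring recovers
# exactly what was typed for each command argument
$arguments = $tokens |
    Where-Object { $_.Type -eq 'CommandArgument' } |
    ForEach-Object { $content.Substring($_.Start, $_.Length) }

$arguments
```

Because Start and Length refer to the raw script text, the same approach should work for any token type, including strings, where Content drops the surrounding quotes but the Start and Length span includes them.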


This adds a whole new dimension to the way you can interact with PowerShell. Some natural outcomes are:



  • syntax highlighting
  • preparing a script for production (replacing all aliased commands with their expanded equivalents, etc.)
  • script refactoring
  • FxCop / Style guideline checks
  • PowerTab :)
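
As a rough illustration of the "preparing a script for production" idea, the sketch below reports which aliases a script uses and what they expand to. `Get-AliasUsage` is a hypothetical helper written for this example, not something that ships with PowerShell. It only looks at tokens of type Command, so an alias name that appears inside a string is left alone.

```powershell
# Hypothetical helper: report the aliases used in a piece of script text
function Get-AliasUsage([string] $scriptText)
{
    # Build a lookup of alias name -> definition (hashtable keys are case-insensitive)
    $aliases = @{}
    Get-Alias | ForEach-Object { $aliases[$_.Name] = $_.Definition }

    $errors = $null
    $tokens = [System.Management.Automation.PSParser]::Tokenize($scriptText, [ref] $errors)

    foreach($token in $tokens)
    {
        # Only tokens in command position can be aliases
        if($token.Type -ne 'Command') { continue }

        if($aliases.ContainsKey($token.Content))
        {
            New-Object PSObject -Property @{
                Line = $token.StartLine
                Alias = $token.Content
                Definition = $aliases[$token.Content]
            }
        }
    }
}

# The second "gci" is part of a string, so only the first one is reported
Get-AliasUsage 'gci; Write-Host "gci"'
```

Actually rewriting the script would be a matter of walking the matching tokens from last to first and replacing the text at each token's Start and Length, so that earlier offsets stay valid.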

As a starter example, I’ve attached Show-ColorizedContent.ps1 — a script to colorize PowerShell scripts in a console window. Its primary goal is to support demonstrations of PowerShell snippets. For that, it adds line numbers to let you easily refer to portions of your script. It also includes a -HighlightRanges parameter to let you highlight specific ranges of the script. The -HighlightRanges parameter is an array of line numbers, which you can easily create using PowerShell’s standard array range syntax:


PS C:\Temp> .\Show-ColorizedContent.ps1 Show-ColorizedContent.ps1 `
>> -HighlightRanges (4..5+1+30..33)
>>

001 > #requires -version 2.0
002 |
003 | param(
004 > $filename = $(throw "Please specify a filename."),
005 > $highlightRanges = @(),
006 | [System.Management.Automation.SwitchParameter] $excludeLineNumbers)
007 |
008 | # [Enum]::GetValues($host.UI.RawUI.ForegroundColor.GetType()) | % { Write-Host -Fore $_ "$_" }
(…)
026 | $highlightColor = "Green"
027 | $highlightCharacter = ">"
028 |
029 | ## Read the text of the file, and parse it
030 > $file = (Resolve-Path $filename).Path
031 > $content = [IO.File]::ReadAllText($file)
032 > $parsed = [System.Management.Automation.PsParser]::Tokenize($content, [ref] $null) |
033 > Sort StartLine,StartColumn
034 |
035 | function WriteFormattedLine($formatString, [int] $line)
036 | {
037 | if($excludeLineNumbers) { return }
038 |
039 | $hColor = "Gray"
040 | $separator = "|"
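
If the range expression in that command looks odd, it is just array arithmetic: the range operator binds more tightly than +, so 4..5+1+30..33 builds a single flat array. The sketch below shows that, along with one way the line prefix in the listing above could be produced. `Format-LinePrefix` is a hypothetical function written for this example; the attached script does this work in its own WriteFormattedLine function.

```powershell
# (4..5) + 1 + (30..33) concatenates into one flat array of line numbers
$highlightRanges = 4..5+1+30..33
$highlightRanges -join ','

# Hypothetical helper: build the "001 >" or "002 |" prefix for a line
function Format-LinePrefix([int] $line, [int[]] $highlightRanges)
{
    $marker = '|'
    if($highlightRanges -contains $line) { $marker = '>' }

    # D3 pads the line number to three digits
    '{0:D3} {1}' -f $line, $marker
}

1..5 | ForEach-Object { Format-LinePrefix $_ $highlightRanges }
```

The order of the numbers in the array does not matter here, since each line is only checked for membership with -contains.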


Enjoy — you can download it here.

5 Responses to “Syntax Highlighting in PowerShell”

  1. Wes Haggard writes:

    Cool I’m going to have to play with this and all the new features in version 2.0.

  2. /\/\o\/\/ writes:

    My first thought when I found the Tokenizer:

    Sure I can do some good damage with this in PowerTab ;-)

    thanks

    Greetings /\/\o\/\/

  3. Marco Shaw writes:

    Just wanted to say thanks for blogging about this new feature. We appreciate it!

  4. Joel "Jaykul" Bennett writes:

    "Show"? Not to be a grammar nazi, but I’m assuming you didn’t just pick that out of a hat without thinking about it being a valid verb … do you have access to a different list of "ok" verbs than what’s been published in the early spec? http://www.microsoft.com/technet/scriptcenter/topics/winpsh/cmdline_std.mspx#EWHAC

    *(this is where I whine about documentation)*
    It sucks!

    *(this is where I thank you for an awesome script)*
    It rocks so much, I was willing to take your "are you a human" test to tell you so.

    *(this is where I link to my script for expanding aliases to full names)*
    http://powershellcentral.com/scripts/55

    :-)

  5. Lee writes:

    When it comes to cmdlet naming consistency, sensitivity to grammar is a virtue. Please be loud and vociferous if you ever see transgressions :)

    "Show" just recently got past the RFC stage — http://blogs.msdn.com/powershell/archive/2007/05/09/proposed-new-standard-verbs.aspx. It implies output with formatting — not to be consumed by downstream tools. Kind of like Out-*, but not to a device.

    Nice job on the script!
