Today, my task was looking at all the solutions handed in for the Scripting Games Task #1. Boy, what a number of great solutions! You all did a very good job. My job now is to highlight what I liked best and what I think could be improved.
I am not going into detail with any specific solution. I am simply talking about some best practices.
Using -Include is Bad
The task was to examine a bunch of log files that reside in several subfolders and archive those who are older than x days. You all used Get-ChildItem to gather the files. Get-ChildItem is a great way of finding files, even recursively. However, Get-ChildItem has a bunch of parameters and modi you can use, and they can severely affect performance.
One thing I noted was that many of you used the parameter -Include to limit files to *.log files. This really does not make much sense because -Include is a client-side filter built-in only for backwards compatibility if a provider cannot support filters (that’s what you need when you want to search the registry). -Include can also be useful if you want to submit an array of search filters which is not the case here.
Instead of -Include, always use -Filter. This is the server-side filtering that is done internally by the FileSystem provider. It is much faster.
Using -Recurse is Bad
Some of you simply took the root folder and made Get-ChildItem recurse through its entire subfolder tree. While the task description did not mention anything, in real world scenarios, the files you are after may be located in a wild folder structure with many other files, so recursing through a tree of subfolders is potentially dangerous because it might take a long time.
You can use -Recurse, but if you can narrow down your search, then do it.
In this case, narrowing down the search is easy because all files reside in subfolders that are child to one common parent folder. Instead of -Recurse, you are much better off using wildcards like this:
Get-ChildItem -Path C:\Application\Log\*\*.log
Note that you can use multiple wildcards. In this scenario, I used one wildcard for all the subfolders that are on the same level, then another wildcard for the filename. Since I am not looking into any subsequent subfolders, it is completely ok to specify the file pattern in your Path statement.
Avoid .NET Code
To find old files, you need the date 90 days ago. You all did a good job using datetime math and subtracted a timespan from a date or used the AddDays() method built into any date.
One thing I like to mention is that this is a PowerShell contest. While PowerShell builds on top of the .NET framework, and while you can resort to direct .NET calls if anything else fails, it is good practice to stick to cmdlets if they are available.
That’s why I do not like code like [DateTime]::Now(). Instead, simply use Get-Date, and your script becomes much more friendly and easier to read. Remember that there is help available for cmdlets like Get-Date but there is no help built into PowerShell for low level .NET functionality.
The cutoff date could be calculated either way, and again no .NET raw code is necessary:
(Get-Date).AddDays(-90) (Get-Date) - (New-Timespan -Days 90)
Using a .NET method like AddDays() instead of having to use New-TimeSpan is ok though. Although you are resorting to low level .NET, this approach is easier to read and shorter. So my best practice is: using object methods (dynamic methods built right into objects) is ok. Using static .NET methods (like Now()) is only a last resort and should be avoided if possible because it is introducing more complexity.
Foreach vs. Foreach-Object
Some of you use a foreach() loop, others use Foreach-Object in a pipeline to process the files found. What is better?
foreach() is faster but it requires the data to be collected in memory first, so it burs potentially much more memory. Whenever you do file system operations where you don’t know the exact number of files you will encounter, always use Foreach-Object in a pipeline.
This will not only save memory but also provide real-time results: whenever the first file is found, it will be processed. With foreach(), files would only be processed after all files have been collected, and if something fails before, that’s it.
Don’t use Aliases, Always Use Parameters
What you do in an interactive PowerShell is up to you, because your code will not stay for others to look at.
Whenever you start writing a script, and especially when you hand it in to the Scripting Games, NEVER use aliases like “?”. Use Where-Object. Scripts should be easy to understand and not cryptic.
Also, never use positional parameters. Instead, use named parameters. It is just a matter of pressing TAB to autocomplete and a great sign of respect to your colleagues that need to read your code next week or next month.
Watch out which parameters you use and if they might break compatibility. For example, if you use Get-ChildItem with -File, then your script will no longer run on PowerShell 2.0 because -File was introduced in PowerShell 3.0. You should at least put a comment into your script.
The same is true for $PSItem which is just a new name for $_ inside of script blocks. It really does not make sense to use $PSItem. The only thing it does is breaking compatibility. Stick to $_ instead, at least for as long as there are still a lot of Windows XP machines out there that cannot update to PowerShell 3.0.
That’s it for now. Keep up the great work, guys!
Microsoft MVP PowerShell
If you like to set up a great PowerShell training in Europe, rent me! tobias.weltner(AT)email.de…!