Beyond Impact Blog

Learn Powershell in 5 (More) Painless Steps: Data - Types

Feb 27, 2017 1:38:58 PM / by Cole McDonald

We've taken the time to learn how to use Powershell to grab a bunch of information.  We've discussed how to use it to do stuff and some of the types of things we can do with it.  It's time to dig a bit deeper into how we're dealing with the information we're gathering to make it easier, faster, stronger to use once we've gathered it.  This will get even more geeky than we've done so far.  Hold on to your keyboards, this could get messy.
 

Data is a plural

 
The singular of Data is Datum.  Any single point of Data is a Datum.  I did say we were going to get geeky, didn't I?  Flow control (loops and decisions) is interchangeable from language to language, system to system.  The main job of a developer is to deal with data.  Any functional block of code can be visualized (visualised for the Queen's English folks in the audience) as a black box with an input and an output.
 
If we look at the data flow for Microsoft's System Center Operations Manager (SCOM) at a really basic level, at its most complicated configuration, data from the computer being monitored travels through a gateway, then to the management server, then to the database and data warehouse.
 
SCOM Data Flow: Server -> Gateway -> Management Server -> Databases
 
Each of the major stops along the way does something with the data.  It collects it, compresses it, bends, folds, and rolls it.  We don't know quite what is going on behind the scenes, but it doesn't matter.  All we need to know is what type of data needs to go into each part and what comes out.  If we consider a scenario that we create ourselves, we can imagine this type of flow:
 
CPU % Utilization -> do something if it's > 50% -> perform the action
 
The first point here is the computer.  We don't necessarily care how it's collecting the data.  For the purposes of this exercise, all we need to know is that it delivers it from perfmon as a floating point value from 0-1 (0.00 represents 0%, 0.50 represents 50%, 1.00 represents 100%).  The function we'll be writing that performs the action at the end will be triggered if the output of the middle bit is $true.  So our middle bit has to turn the floating point value from the first bit into a boolean to deliver to the last bit.
 
function FigureOutIfWeNeedToDoSomething {
 param ( [float]$CPUUtilization )
 
 if ( $CPUUtilization -ge 0.5 ) {
  return $true
 } else {
  return $false
 }
}
 
Each cmdlet and function we deal with will have an expectation of what it can accept as input.  For predefined cmdlets, we can run Get-Help on them to figure out what they're expecting for each parameter we can provide.  The help should also tell us what the output type will be.  In Powershell, most cmdlet output is delivered as a data object of some sort that bundles together a collection of data.  This allows for much more complexity in data exchange.
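A quick sketch of how that lookup might go, using Get-Date and Sort-Object purely as stand-in examples (any built-in cmdlet would do).  The declared output types can be empty for some commands, so checking an actual result with GetType() and Get-Member is a useful backstop:

```powershell
# Parameter details, including the type the parameter accepts
Get-Help Sort-Object -Parameter Property

# The output types the command declares (may be empty for some commands)
(Get-Command Get-Date).OutputType

# Inspect the type and the properties of a real result
$now = Get-Date
$now.GetType().FullName
$now | Get-Member -MemberType Property
```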
 
A series of parameters can be mixed, matched, shaken, and stirred to provide a broad array of output.  The object is our friend as well in Powershell, as the pipeline system is made to stream objects: each object is handed to the next command as soon as it's produced, so the stages of the pipeline work through the data together rather than waiting on each other.  The only time this isn't the case is if the cmdlet / function needs all of the results of the previous piece of the pipeline to perform its operation.  In these cases, this function should be at the end of the pipeline as often as possible:
 
Sort-Object # Requires all objects to perform its function
Group-Object # Requires all objects to perform its function
 
Basically, anything that operates on a group of objects rather than a single object has to wait for the whole group to arrive before it can hand anything on, so it will be a bottleneck in our processing chain.
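A rough way to see the difference is to log the order in which each stage of a pipeline sees each object.  The variable names here are made up for the illustration:

```powershell
# Record the order in which each stage sees each object
$streamLog = [System.Collections.Generic.List[string]]::new()
$sortLog   = [System.Collections.Generic.List[string]]::new()

# Streaming: each object travels the whole pipeline before the next one is produced
3, 1, 2 |
    ForEach-Object { $streamLog.Add("produced $_"); $_ } |
    ForEach-Object { $streamLog.Add("consumed $_") }

# Blocking: Sort-Object has to see every object before it can emit the first one
3, 1, 2 |
    ForEach-Object { $sortLog.Add("produced $_"); $_ } |
    Sort-Object |
    ForEach-Object { $sortLog.Add("consumed $_") }

$streamLog -join ', '
$sortLog -join ', '
```

In the first log the produced and consumed entries alternate.  In the second, all three produced entries come first because nothing gets past Sort-Object until the input is finished.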
 

Data Types

 
We know Powershell likes objects and is even optimized to operate using them.  These objects are made of smaller pieces of data with specific types.  Data types are an important discussion because they tell us how the computer is storing our data and how it can process them (I know "them" looks weird but remember, data is plural).  We'll try to break down the data types as a progression from simplest to most complex.
 

Bits:


[BOOL]
Boolean is conceptually a single bit: 0 or 1, on or off.  (In practice .NET pads it out to a full byte in memory, but it still only ever holds one of two values.)  This is what computers are made of.  Digital Star Stuff.

$true and $false hold these two values.  These RESERVED variables are defined in the language for ease of use.  Cast to a number, they become 1 and 0 respectively.  The parameter type [SWITCH] behaves like a boolean which defaults to $false.  This allows you to attach flags to your commands.  I use this often while troubleshooting to show the point in the code where errors are happening:
 
function Get-ServerMetrics {
 param (
  [switch]$testing
 )
 # Write-Host keeps the progress messages out of the function's return value
 if ($testing) {Write-Host "Gathering Agent Info"}
 $agents = Get-SCOMAgent -ComputerName "SCOM-ManagementServer.domain.tld"
 if ($testing) {Write-Host "Returning Info"}
 return $agents
}
#show with flow indicators
Get-ServerMetrics -testing
#show without flow indicators
Get-ServerMetrics
 
This allows for simple identification of where a script is stopping within a function.  The switch can be used for defining types of output as well.

-Verbose is a built-in flag for most cmdlets.
-Console is one I often add to turn console output (write-output) on or off.
-Email is one I've added to direct output to an SMTP server.
 
We still have to write the code for these, but internally, it's used the same as that testing code was:
 
if ($console) {
 # Send the output variable we've built to the console
 write-output $outputVariable
}
if ($email) {
 # Send the output variable we've built as the body of an email
}
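As a minimal sketch of how such flags might sit together in one function: Get-Report and its report text are hypothetical, and Write-Host is used for the console copy here (an assumption, not necessarily how the original scripts do it) so the message doesn't also end up in the returned value.  Adding [CmdletBinding()] is what makes the built-in -Verbose flag available on our own functions:

```powershell
function Get-Report {
    [CmdletBinding()]              # enables common parameters such as -Verbose
    param (
        [switch]$Console           # defaults to $false when not supplied
    )

    Write-Verbose "Building report"    # only shown when -Verbose is passed
    $outputVariable = "report body"

    if ($Console) {
        # Show a copy on the console without adding it to the return value
        Write-Host $outputVariable
    }

    return $outputVariable
}

$quiet = Get-Report
$loud  = Get-Report -Console -Verbose
```

Both calls return the same value.  The flags only change what is shown along the way.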
 
If we string together a set of boolean storage, we can make numbers using the binary numbering system.  Luckily for us, we don't need to work out the value of 01101001 by hand.  The decimal conversion is built in for us.  We're going to have to know two number classifications here: integer and floating point.  Integers are whole numbers, positive or negative, including zero, with no decimal places.  Floating point numbers are real numbers, positive or negative, that can carry decimal places.
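For the curious, the built-in conversion can be reached through the .NET Convert class.  A small sketch, with variable names chosen just for this example:

```powershell
# Binary text to a number: the second argument is the base to read from
$fromBinary = [Convert]::ToInt32('01101001', 2)

# And back again: the second argument is the base to write in
$toBinary = [Convert]::ToString($fromBinary, 2)

# ToString drops leading zeros, so pad back out to 8 bits
$padded = $toBinary.PadLeft(8, '0')

$fromBinary   # 105
$padded       # 01101001
```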
 
Some of this info comes directly from Powershell, using [classname]::minValue and [classname]::maxValue.  That :: calls into the class; intellisense will let you see more available methods and properties as you're typing.
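Here is one way those lookups might be gathered into a single table.  The loop and the $ranges variable are just for this sketch:

```powershell
$integerTypes = [byte], [sbyte], [int16], [uint16], [int32], [uint32], [int64], [uint64]

# Ask each class for its own limits using the :: static member syntax
$ranges = foreach ($type in $integerTypes) {
    [pscustomobject]@{
        Type = $type.Name
        Min  = $type::MinValue
        Max  = $type::MaxValue
    }
}
$ranges | Format-Table -AutoSize

# A value outside the range is rejected rather than silently wrapped around
try { [byte]256 } catch { "256 does not fit in a [BYTE]" }
```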
 
The rest of the information comes from the amazing Ed Wilson, Microsoft Scripting Guy: https://blogs.technet.microsoft.com/heyscriptingguy/2015/01/26/understanding-numbers-in-powershell/
 

Integers:


[BYTE]
Unsigned integer : positive only.
These are stored in memory as a series of 8 bits.
0 through 255

[SBYTE]
Signed integer : positive and negative, no decimal places
These are stored in memory as a series of 8 bits.
-128 through +127

[INT16]
Signed integer : positive and negative, no decimal places.
These are stored in memory as a series of 16 bits. Also called [SHORT]
-32768 through +32767
 
[UINT16]
Unsigned integer : positive only.
These are stored in memory as a series of 16 bits.  Also called [USHORT]
0 through +65535
 
[INT32]
Signed integer : positive and negative, no decimal places.
These are stored in memory as a series of 32 bits. Also called [INT]
-2147483648 through +2147483647
 
[UINT32]
Unsigned integer : positive only.
These are stored in memory as a series of 32 bits.
0 through +4294967295
 
[INT64]
Signed integer : positive and negative, no decimal places.
These are stored in memory as a series of 64 bits. Also called [LONG]
-9223372036854775808 through +9223372036854775807
 
[UINT64]
Unsigned integer : positive only.
These are stored in memory as a series of 64 bits.  Also called [ULONG]
0 through +18446744073709551615
 

Decimals:

 
[SINGLE]
32 bit floating point number
7 significant digits.  Also called [FLOAT]
-3.402823E+38 through +3.402823E+38

[DOUBLE]
64 bit floating point number
15 significant digits
-1.79769313486232E+308 through 1.79769313486232E+308
 
[DECIMAL]
128 bit floating point number
28 significant digits
-79228162514264337593543950335 through 79228162514264337593543950335
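The difference in significant digits shows up quickly in everyday arithmetic.  A small sketch (the d suffix marks a literal as [DECIMAL]; the variable names are only for illustration):

```powershell
# [DOUBLE] stores binary fractions, so 0.1 and 0.2 are only approximations
$doubleSum = 0.1 + 0.2

# [DECIMAL] stores base-10 digits, so the same sum is exact
$decimalSum = 0.1d + 0.2d

$doubleSum -eq 0.3      # False
$decimalSum -eq 0.3d    # True
```

This is why [DECIMAL] tends to be the safer pick for things like currency, at the cost of more memory and slower math.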
 

Things:

 
[ARRAY]
A series of arbitrary elements stored in association with an index (position in array starting at 0)
Think of it as one of those parts sorting bins with all the little storage cubes
Give each a number and you can find it by number
 
[HASHTABLE]
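A series of key / value pairs.
Used to reference the stored parts by name (the key) rather than by position.  A plain hashtable keeps no guaranteed order, so positional indexes don't apply; an [ordered] hashtable supports both name and index.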
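A short sketch of both containers in use.  The parts-bin contents and the server details are invented for the example:

```powershell
# Array: find things by position, starting at 0
$bins = 'bolts', 'nuts', 'washers'
$first = $bins[0]      # bolts
$last  = $bins[-1]     # washers (negative indexes count from the end)

# Hashtable: find things by name
$server = @{ Name = 'web01'; Cores = 4 }
$name  = $server['Name']   # web01
$cores = $server.Cores     # 4 (dot notation works too)
$byPosition = $server[0]   # nothing comes back: 0 is treated as a key, and there isn't one

# Ordered hashtable: keeps insertion order, so a position works as well as a name
$ordered = [ordered]@{ Name = 'web01'; Cores = 4 }
$firstValue = $ordered[0]  # web01
```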
 

Strings:

 
[CHAR]
16 bit unicode character
Each value in the range represents a single unicode character
These are still stored as numbers according to that system: [char]65 is "A", [char]97 is "a".
 
[STRING]
array of [CHAR]
 
Since the <space> is just another character, the STRING can contain any word, sentence, paragraph, etc.
You can use quotes instead of explicitly stating the class.
"String Goes Here" - Double quotes allow inserting variables into the string "String Goes $locationVariable"
'String Goes Here' - Single quotes don't allow the variable to be evaluated. It will actually show as a dollar sign followed by the variable name.
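Side by side, using a sample value for $locationVariable:

```powershell
$locationVariable = 'Here'

# Double quotes: the variable is evaluated and its value is inserted
$double = "String Goes $locationVariable"

# Single quotes: the text is taken literally, dollar sign and all
$single = 'String Goes $locationVariable'

$double   # String Goes Here
$single   # String Goes $locationVariable
```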
 
Knowing this, we can convert to uppercase using math by subtracting 32 from the unicode value.
 
[CHAR]$inputChar = "a"
# Subtract 32 from the numeric value to get the uppercase value
[INT]$inputChar - 32
# Cast it to CHAR to show the letter
[CHAR]([INT]$inputChar - 32)
 
Here's an actual chunk of code that processes a whole string by index:
 
$inputString = "simple uppercase example"
$outputString = ""
Write-Output $inputString
for ($index = 0; $index -le ($inputString.length - 1); $index++) {
 $thisCharacter = [INT]$inputString[$index]
 $outputString += [CHAR]($thisCharacter - 32)
}
Write-Output $outputString
 
To be complete about it, we'd want to verify that the value of each char was within the range of lower case letters (hint: there are 26 letters in the alphabet)
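One way that check might look, as a sketch: the function name ConvertTo-UpperAscii is made up for this example, and it only handles the plain letters a through z (values 97 through 122), leaving spaces, digits, and everything else untouched:

```powershell
function ConvertTo-UpperAscii {
    param ( [string]$InputString )

    $outputString = ""
    foreach ($character in $InputString.ToCharArray()) {
        $code = [int]$character
        # Only shift values inside the lowercase range: 97 ("a") through 122 ("z")
        if ($code -ge 97 -and $code -le 122) {
            $code -= 32
        }
        $outputString += [char]$code
    }
    return $outputString
}

ConvertTo-UpperAscii "simple uppercase example"   # SIMPLE UPPERCASE EXAMPLE
```

In real scripts the string's own ToUpper() method does this for us; the math version is here to show what's going on underneath.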
 
With Arrays and Hashtables, we can think of them as containers to store the other types.  Most interestingly, we can even store arrays and hashtables within arrays and hashtables.  In the Learn Powershell in 5 Painless Steps series, we actually did a little bit of this as we added hashtables of server information into an array.  This allows for some wonderfully complex data structures.  If we look at how a server's information is stored in SCOM, it's a hashtable of data (AN hashtable?) about an agent which includes quite a few one-off pieces of information as well as arrays and hashtables.
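A small sketch of that nesting, with invented server names and properties (this is not the actual SCOM layout, just the same idea in miniature):

```powershell
# An array of hashtables, each holding one-off values, an array, and another hashtable
$servers = @(
    @{ Name = 'web01'; Cores = 4; Disks = @('C:', 'D:');       Owner = @{ Team = 'Web' } },
    @{ Name = 'sql01'; Cores = 8; Disks = @('C:', 'E:', 'F:'); Owner = @{ Team = 'Data' } }
)

# Walk down through the layers: array index, then key, then index or key again
$secondName = $servers[1].Name         # sql01
$thirdDisk  = $servers[1].Disks[2]     # F:
$firstTeam  = $servers[0].Owner.Team   # Web

# Each element can still flow down the pipeline one at a time
$summary = $servers | ForEach-Object { "{0} has {1} disks" -f $_.Name, $_.Disks.Count }
$summary
```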
 
By retrieving agents using Get-SCOMAgent and assigning the result to a variable, we load all of that data into an array.  Each object in that array contains the server information, arrays for information that can have multiple pieces (such as CPU), and links to other types of objects like the ClassInstances of each object being monitored from that server.
 
As we progress through this series, we're going to look at how we can use these data types to create our own custom objects for datasets we intend to work with: storage, movement, manipulation and, ultimately, how we can tie all of those bits together to actually use custom data structures to gather, analyze and distribute data.
 

------------------------------------------------------------------------

Did you find this article useful?  Let me know at cole.mcdonald@beyondimpactllc.com

If you want to be kept informed, follow our RSS feed: http://blog.beyondimpactllc.com/blog/rss.xml


Tags: Data, 5 painless steps, powershell
