Extracting and monitoring web content with PowerShell



This kind of request comes up all the time on StackOverflow and /r/PowerShell.

“How can I extract content from a webpage using PowerShell?”

And it’s an interesting problem to solve.  However, nothing motivates like greed, and I recently revisited this topic in order to help me track down the newest must-have item, the Switch.

In fact, this post COULD have been called ‘Finding a Nintendo Switch with PowerShell’!

I have been REALLY wanting a Nintendo Switch, and since I’ll be flying up to NYC next month for Tome’s NYC TechStravaganza (come see me if you’ll be in Manhattan that day!), it’s the perfect justification for She-Who-Holds-The-Wallet for me to get one!

But EVERYWHERE is sold out.  Still!  😦

However, the stores have been receiving inventory every now and then, and I know that when GameStop has it in stock, I want to buy it from them!  With that in mind, I knew I just needed a way to monitor the page and alert me when some text on it changes.

Web scraping, here we go!

Caveat: Scraping a site isn’t illegal, but it might violate the terms of service of some sites out there.  Furthermore, if you scrape too often, you might be blocked from the site temporarily or forever.  Don’t get greedy in scraping, or try to use it commercially.

If a site provides an API, go that route instead, as APIs are sanctioned and provided by the company for exactly this purpose, and need only a fraction of the resources of loading a full page.

Finally, some Content Management Systems will never update an existing page, but create a new one with a new URL and update all links accordingly.  If you’re not careful, you could end up querying a page that will never change. 

GameStop Nintendo Switch with Neon Joycons

First things first, let’s load this page in PowerShell and store it in a variable.  We’ll be using Invoke-WebRequest to handle this task.

$url ='http://www.gamestop.com/nintendo-switch/consoles/nintendo-switch-console-with-neon-blue-and-neon-red-joy-con/141887'
$response = Invoke-WebRequest -Uri $url

Next, I want to find a particular element on the page, which I’ll parse to see if it looks like they have some in stock. For that, I need to locate the ID or ClassName of the particular element, which we’ll do using Chrome Developer Tools.

On the page, right-click an element of your choosing and pick ‘Inspect Element’.  In my case, I will right-click on the ‘Unavailable’ text area.

This will launch the Chrome Developer Tools with the element already selected for you, so you can just copy the class name.  I move the mouse around the nearby elements to see which one is the most likely to contain the value.


You want the class name, in this case ats-prodBuy-inventory.  We can use PowerShell’s wonderful HTML parsing to do some heavy lifting here, by leveraging the HtmlWebResponseObject’s useful ParsedHtml.getElementsByClassName method.

So, to select only the element in the body with the class name of ats-prodBuy-inventory, I’ll run:

$response.ParsedHtml.body.getElementsByClassName('ats-prodBuy-inventory')

This will list ALL the properties of this element, including lots of HTML info and properties that we don’t need.

To truncate things a bit, I’ll select only properties which have text or content somewhere in the property name.

$classname = 'ats-prodBuy-inventory'
$response.ParsedHtml.body.getElementsByClassName($classname) | select *text*,*content*

The output:

innerText         : Currently unavailable online
outerText         : Currently unavailable online
parentTextEdit    : System.__ComObject
isTextEdit        : False
oncontextmenu     : 
contentEditable   : inherit
isContentEditable : False

Much easier to read.  So, now I know that the innerText or outerText properties will let me know if the product is in stock or not.  To validate, I took a look at another product which was in stock, and saw that the same properties held its availability text.
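One thing to be aware of: ParsedHtml is only populated by Windows PowerShell, and PowerShell 6 and later don't return it at all.  If you're on a newer version, here's a rough sketch of a fallback that pulls the text out of the raw HTML in $response.Content with a regular expression.  Get-TextByClassName is a name I made up for this example, not a built-in cmdlet, and regex is a fragile way to read HTML (it will trip over an element nested inside another element of the same tag), so treat it as a starting point rather than a replacement for a real parser.

```powershell
# Hypothetical helper (not a built-in cmdlet): returns the text of every
# element whose class attribute contains the given class name.
function Get-TextByClassName {
    param(
        [Parameter(Mandatory)] [string] $Html,
        [Parameter(Mandatory)] [string] $ClassName
    )

    $name = [regex]::Escape($ClassName)

    # Opening tag with a class attribute containing the whole class name,
    # then a lazy capture up to the first closing tag of the same element.
    $pattern = '<(?<tag>\w+)[^>]*\sclass\s*=\s*["''][^"'']*(?<![\w-])' + $name +
               '(?![\w-])[^"'']*["''][^>]*>(?<inner>.*?)</\k<tag>\s*>'

    $options = [System.Text.RegularExpressions.RegexOptions]'IgnoreCase, Singleline'

    foreach ($match in [regex]::Matches($Html, $pattern, $options)) {
        # Drop any nested tags, decode entities, and tidy up the whitespace
        $text = $match.Groups['inner'].Value -replace '<[^>]+>', ''
        $text = [System.Net.WebUtility]::HtmlDecode($text)
        ($text -replace '\s+', ' ').Trim()
    }
}

# Try it on a small sample string instead of the live page
$sample = '<div id="x"><p class="ats-prodBuy-inventory">Currently <b>unavailable</b> online</p></div>'
Get-TextByClassName -Html $sample -ClassName 'ats-prodBuy-inventory'
```

Against the real page, you would pass $response.Content as -Html and the class name copied from the developer tools as -ClassName.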

All that remained was to take this few-liner and convert it into a script which loops once every 30 minutes and exits when the message text on the site changes.  When it does, I’m using Send-PushMessage, a tool I wrote a few years ago, to send a PushBullet message to my phone to give me a heads-up!

$url = 'http://www.gamestop.com/nintendo-switch/consoles/nintendo-switch-console-with-neon-blue-and-neon-red-joy-con/141887'
$classname = 'ats-prodBuy-inventory'
$notInStock = 'Currently unavailable online'

# Check the page every 30 minutes until the inventory text changes
do {
    $response = Invoke-WebRequest -Uri $url
    $InStock = $response.ParsedHtml.body.getElementsByClassName($classname) | select -expand innertext
    "$(get-date) is device in stock? $($InStock -ne $notInStock)`n-----$InStock"

    # Only wait if there is still nothing to buy
    if ($InStock -eq $notInStock) { Start-Sleep -Seconds (60*30) }
} while ($InStock -eq $notInStock)

# The text on the page changed, so send the alert
Send-PushMessage -Type Message -title "NintendoSwitch" -msg "In stock, order now!!!!"

This is what I’ve been seeing…but eventually I’ll get a Push Message when the site text changes, and then, I’ll have my Switch!
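If you want to reuse the same idea for other pages, the polling logic can be pulled out into its own function.  This is only a sketch under a few assumptions of mine: Wait-ForTextChange and its parameter names are invented for this example, an empty result is treated as "the fetch didn't work, try again" rather than as a change, and there is a cap on the number of checks so a forgotten script doesn't hammer the site forever.

```powershell
# Hypothetical helper (not a built-in cmdlet): runs $GetText repeatedly until
# it returns something other than $Baseline, then returns the new text.
# Returns $null if nothing changed within $MaxChecks attempts.
function Wait-ForTextChange {
    param(
        [Parameter(Mandatory)] [scriptblock] $GetText,  # returns the current text
        [Parameter(Mandatory)] [string] $Baseline,      # text that means "no change"
        [int] $IntervalSeconds = 1800,                  # 30 minutes between checks
        [int] $MaxChecks = 48                           # give up after this many tries
    )

    for ($i = 1; $i -le $MaxChecks; $i++) {
        $current = [string](& $GetText)

        # An empty result is treated as a failed fetch, not as a change.
        # Note that -ne compares strings case-insensitively.
        if ($current -and $current -ne $Baseline) {
            return $current
        }

        # No need to sleep after the final attempt
        if ($i -lt $MaxChecks) { Start-Sleep -Seconds $IntervalSeconds }
    }

    return $null
}

# Possible usage against the page from this post (left commented out here):
# $found = Wait-ForTextChange -Baseline 'Currently unavailable online' -GetText {
#     (Invoke-WebRequest -Uri $url).ParsedHtml.body.getElementsByClassName('ats-prodBuy-inventory') |
#         Select-Object -ExpandProperty innerText
# }
# if ($found) { Send-PushMessage -Type Message -title "NintendoSwitch" -msg "In stock, order now!!!!" }
```

Keeping the fetch in a script block means the waiting logic can be tried out with a fake text source and a zero-second interval before pointing it at a real site.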

Willing to help!

Are you struggling to extract certain text from a site?  Don’t worry, I’m here to help!  Leave me a comment below and I’ll do my best to help you.  But before you ask, check out this post on Reddit to see how I helped someone else with a similar problem.

reddit/r/powershell: Downloading News Articles from the Web



12 thoughts on “Extracting and monitoring web content with PowerShell”

  1. fxslayer March 30, 2017 / 2:08 pm

    I have used this on IE & Chrome but currently have data embedded inside ‘EO.Web’ controls (Essential Objects) java wrapper – and my Webpage that is inside the wrapper has hidden elements which are not recognized by inspect.exe OR UIspy or UIAutomation spy.


    • FoxDeploy March 30, 2017 / 4:02 pm

      If the data is being loaded by a Java connection, you should use fiddler to examine the connection and see if you can replicate it. If this is publicly accessible, I can help.


    • Nas November 19, 2018 / 8:12 am

      This is just what I’m looking for though I have an issue ( Forgive my ignorance I’m quite new to Powershell). I’m trying to write a script that will query a website to check if the latest version of a particular software is available by querying the ‘date’ class (which is a class name repeated over the page) within a tr id. (‘download-2209’) How would I modify the line below to enable this


      If it has changed I then want it to send a mail out and download it

      it’s a secure site so I’ve added the line below. to the start of the script.

      Any help is much appreciated


      • FoxDeploy November 19, 2018 / 10:57 am

        Hi! Please post your code as a github gist or pastebin link and share it with me. You can email me as well if it’s very secure and you’re concerned ☺


  2. Paul September 27, 2017 / 2:32 pm

    I’m going through this now. First off, awesome real world example of how to practically scrape websites with PS! A quick nitpicky correction. You should change your code from $rep.ParsedHtml.body.getElementsByClassName('ats-prodBuy-inventory') to $response.ParsedHtml.body.getElementsByClassName('ats-prodBuy-inventory'). It’s correct in the screenshot, but not the text preceding the screenshot.


  3. Amol Dhaygude November 9, 2018 / 12:23 pm

    Hi, this script works with Internet sites, but I have a requirement to open an internal printer site and extract the data, and it’s not working. Could you please help me?


    • FoxDeploy January 10, 2019 / 5:24 pm

      Sure, I’ll take a look at it tomorrow, should be possible!


    • FoxDeploy January 11, 2019 / 10:58 pm

      Alright, I took a stab and wrote it up here.

      Enjoy, it was a fun little challenge.


  4. DanD February 27, 2019 / 1:59 pm

    I know it’s been a while, but I’m looking to write a fun little script that pulls the current food truck, and perhaps the next one or two as well, from seattlefoodtruck.com. For instance “https://www.seattlefoodtruck.com/schedule/plaza-east”. The JavaScript that runs outputs the details, but no matter what I do I cannot access the contents of the output. Any help on what I’m missing?


    • FoxDeploy March 3, 2019 / 11:32 am

      Here you go, here’s an explanation and the working code to do what you’re trying to do 🙂

      The key is to monitor the request in Chrome Dev tools and filter down to just XHR requests, which is what will be used normally to populate sublists or reactive content in webpages.


Have a code issue? Share your code by going to Gist.github.com and pasting your code there, then post the link here!
