Read the follow-up for an in-depth analysis of many of the techniques used by our combatants!
Saw this super interesting read online over the weekend:
In this post, the author posits that he can crunch numbers from the Linux command line MUCH faster than Hadoop can!
If he can do that, surely we can beat the Hadoop cluster too. Then I started wondering how I would replicate this in PowerShell, and thus this challenge was born…
- Download the repo here (2 GB!), unzip it, and keep the first 10 folders
- This equates to ~3.5 GB, roughly the same amount of data as in the original post
- Be sure to only parse the first 10 folders 🙂
- Iterate through all of the chess record files it contains (*.pgn) and parse out each record. We need to return a total count of black wins, white wins, and draws. To read a PGN: