WebSockets from Scratch
In the web application world – especially single-page applications – smooth and fluid interaction is key. For many years, these applications have been doing a pretty good job of getting this fluid interaction though AJAX techniques and browser support for XMLHttpRequest. One issue, however, is that XMLHttpRequest requires that all of your communication go through an text-based HTTP protocol. Another issue is that XMLHttpRequest doesn’t let a server initiate communication back to connected clients. Instead, clients need to continuously poll the server to find out if it has anything to say.
To solve that need, browser vendors introduced the WebSocket protocol. The WebSocket protocol is a persistent, two-way TCP connection between a WebSocket client (traditionally a browser) and a server.
You might wonder: “That’s all well and good, but why should I care?” That’s a good question with a simple answer: http://www.drawonmybadge.com This is a really cool project designed by Tim McGuffin (@NotMedic). Whatever you draw on the website goes onto his badge – via a 64x32 LED board, and a bunch of more cool software.
Tim carried this around DEF CON this year, and people had lots of fun
But can we have some _real_ fun?
Digging into the Implementation
When you dig into the code a little bit, you can see how the browser does the communication. For every pixel you draw on the canvas, it converts it into a JSON object with a command called “DRAW”, and “DATA” that represents an X, Y, Color combination. Then it sends that string data to a connected WebSocket (exampleSocket.send(…)).
Partial Automation – Code Generation
And from there, the Mona Lisa made her first appearance on Tim’s badge
That was fun, but copying + pasting code that PowerShell generated into a browser still felt a little hacky. Why don’t I just talk to the WebSocket directly from PowerShell? There are a few C# libraries for doing this out there, but I thought it would be a fun / interesting project to implement the protocol from scratch. So that’s what we’re going to do
WebSockets from Scratch
The WebSocket protocol is defined by RFC6455, which goes through the protocol in great detail.
Initial Upgrade Request
For the first phase of the connection, the browser makes a standard TCP connection to the remote server. Here’s how to do that in PowerShell:
In that TCP connection, it makes a standard HTTP request, requesting an upgrade to the websocket protocol. Part of this HTTP request includes a
Sec-WebSocket-Key header, which is intended to ensure that random HTTP requests can’t be retargeted to WebSocket servers, and that WebSocket client requests can’t be targeted to arbitrary other TCP servers. Here’s an example, which hard-codes the key for demonstration purposes.
Once the server accepts the connection upgrade, a well-written client will verify that the key included in the server’s response was correctly derived from the Sec-WebSocket-Key you provided. This is what a server response looks like:
Once the connection has been upgraded, client applications send frames of data. This isn’t the raw data itself – there is a structure around the data to describe it appropriately to the server accepting the connection.
To hold the frame data in PowerShell, we start with defining a byte array. The first 8 bits are:
- FIN: 1 bit describing whether this is the final frame or not. In my experimentation with Chrome, this was always set to ‘1’.
- RSV1, RSV2, and RSV3: 1 bit each that should always be zero unless you’re implementing a special protocol on top of WebSockets.
- OPCODE: 4 bits. The one used by Chrome by default is “TEXT FRAME”, which has the value of 0001.
The next 8 bits (the next byte) are:
- MASK: Whether the data is “masked” by a random obfuscation key. This is recommended for the web browser developers themselves so that malicious web applications can’t cause arbitrary content to be written to the underlying TCP connection itself.
- Payload Length: If the content is less than 126 bytes, this represents the payload length directly. Otherwise it needs to be the value ‘126’ with the next two bytes representing the actual payload length.
Since we are always supposed to mask the data, we define our mask as ‘1. However, this is supposed to be a bit at the leftmost position in the byte, so we need to shift it left by 7 places.
Since the mask and the payload length need to share the same byte (MASK being the leftmost bit and Payload Length being the rest), we use the –bor (Binary Or) bitwise operator to combine them into a single byte and then add that byte to the frame.
In the situation where the message length is greater than 126, though, the next two bytes need to also say how long the payload is. In C# and PowerShell, the way to get the bytes of a number is through the BitConverter.GetBytes() API. This API returns the bytes as they are represented by your processor. On most processors, this is Little Endian, where the least significant digits (think the 1s and 10s columns in decimal) come first.
The WebSocket protocol, however, requires that these bytes are in “Network Order” (the opposite of Little Endian), so we need to reverse them.
After the frame lengh, we need to provide the actual masking key. This is 4 bytes. As mentioned earlier, these are values used to obfuscate the content being transmitted to the server so that malicious web applications can’t communicate at the TCP level directly to the websocket server. The obfuscation algorithm is a simple XOR, where the four bytes in the masking key are used in a round-robin fashion against bytes in the content. Because this is was just for fun and the security protection is irrelevant, I provided a static masking key of “0” since anything XOR’d by 0 doesn’t change anything. That way, I didn’t have to implement the masking algorithm and the server wouldn’t notice
Next, we get to add our actual content to the frame – in this case, a message representing the JSON of pixels that we want to draw.
Finally, we write this frame to the TCP connection and flush it
When all is said and done, pretty pictures show up on Tim’s badge
When I sent Rick Astley, Defender ATP alerted Microsoft IT that one of its computers had PowerShell making raw TCP connections to a newly-registered domain. Amazing!
I tried to WannaCry Tim’s badge, and he still hasn’t paid
So that’s how to go about WebSockets from Scratch. Hope this helps!