Hex Dumper in Monad
Monday, 17 October 2005
Edit: See an updated version: Here
The entry 'Hex Dumper in Monad' was posted
on October 17th, 2005 at 7:20 am
and is filed under Uncategorized.
No. 1 — October 19th, 2005 at 4:23 am
Very cool!
No. 2 — October 21st, 2005 at 11:10 pm
Quite interesting as always, but raises some questions.
What’s get-content? Is it like Unix "cat"? What happens if you omit the default encoding? How do you pass C#-style Stream objects around?
Something like
"openread vsts.iso | readudf "autorun.exe" -data | format"
openread passes a Stream opened in read mode to readudf, which passes autorun.exe as a Stream to format, which in turn determines that it’s binary and outputs it as hex by default. If format is omitted, I am not sure what would happen. Maybe it could ask whether you’d like to open autorun.exe, and keep the original .iso open so that if autorun.exe wants to execute setup.exe, that would be read through readudf.
No. 3 — October 22nd, 2005 at 10:58 am
get-content takes a while to pass even a smallish zip file to the output.
Perhaps get-content should instead return an IEnumerable, which wraps a yield return loop. If I’m right, it should return immediately, without loading the file into memory before passing it to format-hex. format-hex will see the enumerator, and access it just fine. The file will be getting its read hits during formatting, so some buffering could reduce the disk activity.
This probably increases the *total* time spent, but spreading out the loading time should increase responsiveness.
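Sketched out for byte:
using System.Collections;
using System.Collections.Generic;
using System.IO;

// Lazily yields a file's bytes one at a time instead of loading the whole file first.
class FileContentEnumerable : IEnumerable<byte>
{
    private readonly string _fileName;
    public FileContentEnumerable(string fileName) { _fileName = fileName; }

    public IEnumerator<byte> GetEnumerator()
    {
        // The file is opened on the first MoveNext and closed when enumeration ends.
        using (FileStream file = File.OpenRead(_fileName))
        {
            int next;
            // ReadByte returns -1 at end of file, otherwise the byte value as an int.
            while ((next = file.ReadByte()) != -1)
            {
                yield return (byte)next;
            }
        }
    }

    // Non-generic overload required by IEnumerable.
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}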
No. 4 — October 22nd, 2005 at 11:03 am
Okay, I think I was smoking crack.
It’s that initial conversion of $input to byte[], isn’t it?
A similar technique should apply, though. Create an IEnumerable[byte] that takes input and yields the converted byte. Then the conversion happens during formatting, yadda yadda.
I forget — does MSH *have* a yield?
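A rough C# sketch of that idea, in case it helps. The names LazyByteConversion and AsBytes are made up for illustration, and this is plain C#, not anything MSH itself provides. Each item is converted only when the consumer asks for the next byte:

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

static class LazyByteConversion
{
    // Hypothetical helper: wraps any input sequence and converts each item
    // to a byte on demand, rather than building a byte[] up front.
    public static IEnumerable<byte> AsBytes(IEnumerable input)
    {
        foreach (object item in input)
        {
            // Convert.ToByte throws if the item cannot be represented as a byte.
            yield return Convert.ToByte(item);
        }
    }
}
```

Nothing runs until something enumerates the result, so the conversion cost is spread across the formatting pass instead of being paid before the first line of output.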
No. 5 — October 24th, 2005 at 12:09 am
AC: Yes, get-content is the way to get content from any provider that supports it. In the FileSystem, that’s the equivalent of "type" or "cat". On the Function provider, it returns the content of a function. For example, "get-content Function:\Prompt".
The reason that this script takes so long is that MSH currently buffers the output of scripts and functions, but not filters (see get-help about_filter). The get-content operation itself actually returns content byte-by-byte.
Since I wrote it as a script, we have to wait for it to process the whole file before we see any output.
If I had written it as a filter, you would see the output stream immediately. That’s the closest we have to a yield, but it’s not exactly the same. However, filters are stateless, and do not provide the ability to write a single header, or lines of 16 bytes, for example.
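For comparison, here is a minimal C# sketch of a pipeline stage that both streams and keeps state, which is the combination a filter lacks. It is not the script from this post: StreamingHexFormatter, FormatHex, and the header text are assumptions made for illustration. It writes one header, then yields a line as soon as 16 bytes have arrived:

```csharp
using System.Collections.Generic;
using System.Text;

static class StreamingHexFormatter
{
    // Hypothetical sketch: yields a single header line, then one output line
    // per 16 input bytes, without waiting for the whole input.
    public static IEnumerable<string> FormatHex(IEnumerable<byte> bytes)
    {
        yield return "Offset    Bytes";   // emitted exactly once

        var line = new StringBuilder();
        int offset = 0;   // offset of the first byte on the current line
        int count = 0;    // bytes accumulated on the current line

        foreach (byte b in bytes)
        {
            line.Append(b.ToString("X2")).Append(' ');
            count++;
            if (count == 16)
            {
                yield return offset.ToString("X8") + "  " + line.ToString().TrimEnd();
                offset += 16;
                count = 0;
                line.Length = 0;   // reset the buffer for the next line
            }
        }

        // Flush a final partial line, if any.
        if (count > 0)
        {
            yield return offset.ToString("X8") + "  " + line.ToString().TrimEnd();
        }
    }
}
```

The state (offset, count, the partial line) lives in the iterator between calls, so the consumer sees the first line after 16 bytes rather than after the whole file.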
No. 6 — October 25th, 2005 at 4:15 am
Right. If you could define a filter that knew about state, or a function that didn’t buffer all the input, it’d probably work well for this.
How easy is it to define new categories of pipeline elements (function, filter…)?