Blogcleaning, story ideas, a few other things
Posted by Jeremy on November 27th, 2007
So I’ve tidied up the blog a bit and streamlined some things[1].
I removed the tag cloud because, well, tag clouds are stupid and no one really uses them like they should, anyway. In its place sits a new blogroll of TC people. As I come across blogs of people in the area, I’ll just keep adding them to it. If you want yours added to the list, just leave a comment. I don’t really like how there isn’t a definitive separation between individual links; I’ll have to see if I can’t fix that later.
I’ve also added a new “Minnesota Links” page - just a set of URLs I like to hit up that involve my home state. I’ll add to that as well when I come across new resources.
I went to bed last night seeing a dusting of snow outside. Thought I’d have a picture of the snowfall for you today, but when I woke up it was all gone. Though supposedly we’ll have up to an inch of snow tonight, so maybe tomorrow I’ll have the picture[2].
Also, I’m working on a long-term story now. In my quest to prove “You can never really hide yourself on the internet” I’ve created a second blog. I’m using a completely new login ID, one that has nothing to do with me. I’m writing posts about things that are further out of scope than what I normally cover, and I’m rewording things that are currently happening or not writing about them at all. Nevertheless, in six months’ time or so, I will have an exhaustive article on performing internet forensics.
In computer forensics, scientists tear apart and systematically examine a computer bit by bit to find evidence[3] of a crime. Internet forensics is the systematic examination of the internet for a person[4] using known quantifiers about the suspect[5]. Much like handwriting analysis[6], we will be looking for certain indicators that a particular author is responsible for a particular document. Everyone has a certain writing style, much like a fingerprint: the way we combine words, uncommon misspellings, or quirks of grammar. We can use this to build up a list of key phrases our suspect might use, and look for those. Then, when we find a site that matches, we use the samples we have to gauge how likely it is that the site belongs to the person we’re looking for.
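To make the "writing style as a fingerprint" idea a bit more concrete, here is a minimal sketch of one way the comparison step could work. It is not the method I've settled on, just an illustration: it assumes character trigram counts as the "fingerprint" and cosine similarity as the score, and the function names (`ngram_profile`, `cosine_similarity`, `rank_candidates`) are made up for this example.

```python
from collections import Counter
import math
import re


def ngram_profile(text, n=3):
    """Count overlapping character n-grams after normalising case and whitespace."""
    cleaned = re.sub(r"\s+", " ", text.lower()).strip()
    return Counter(cleaned[i:i + n] for i in range(len(cleaned) - n + 1))


def cosine_similarity(a, b):
    """Cosine similarity between two n-gram count profiles (0.0 means nothing shared)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(known_samples, candidates, n=3):
    """Rank candidate texts (name -> text) by similarity to the known samples.

    Returns a list of (score, name) pairs, best match first.
    """
    reference = ngram_profile(" ".join(known_samples), n)
    scored = [
        (cosine_similarity(reference, ngram_profile(text, n)), name)
        for name, text in candidates.items()
    ]
    return sorted(scored, reverse=True)
```

Character trigrams are handy here because they pick up habitual misspellings and punctuation quirks without needing a dictionary. A high score only says two texts look alike, though; it would still take a pile of samples and a manual read to call it a match.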
Eventually, I’d like to see if I can’t build a search engine that will allow you to search for your own written works, perhaps heavily sampled and posted under someone else’s name.
How do we find samples? Actually, it’s a lot easier than you think. Your browser might have an old cached copy of text you want to compare, or the Wayback Machine might have found your site interesting enough to begin archiving - if so we could possibly have over 10 years of samples! And the more samples we have, the more accurate the matches!
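For the Wayback Machine route, a rough sketch of how sample hunting might be automated is below. It assumes the Internet Archive's availability endpoint at `https://archive.org/wayback/available` and the JSON shape I believe it returns (an `archived_snapshots` object with a `closest` entry); treat both as assumptions to check against the Archive's own documentation. The helper names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint; verify against the Internet Archive's documentation.
WAYBACK_AVAILABLE = "https://archive.org/wayback/available"


def availability_url(site, timestamp=None):
    """Build the lookup URL; timestamp is an optional YYYYMMDD-style string."""
    params = {"url": site}
    if timestamp:
        params["timestamp"] = timestamp
    return WAYBACK_AVAILABLE + "?" + urlencode(params)


def closest_snapshot(payload):
    """Pull (snapshot_url, timestamp) out of a decoded response, or None.

    Assumes the response looks like
    {"archived_snapshots": {"closest": {"available": ..., "url": ..., "timestamp": ...}}}.
    """
    closest = payload.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest["url"], closest["timestamp"]


def fetch_closest(site, timestamp=None):
    """Ask the archive for the snapshot nearest to the given timestamp."""
    with urlopen(availability_url(site, timestamp), timeout=10) as resp:
        return closest_snapshot(json.load(resp))
```

Calling `fetch_closest` for a handful of dates spread across the years would give a list of archived pages to pull text from, one sample per snapshot.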
Needless to say, I’m pretty stoked about it all. This project might be a tad out of my normal scope, programmatically speaking, but I will at least have the more labor-intensive manual method to show.
I’m also thinking about writing a little web robot that would screen scrape[7] the local Craigslist “Missed Connections” section, which I read constantly[8][9], then mash it up with Google Maps. We could create an MC heat map of the Twin Cities. I think it would require a lot of handwork, figuring out addresses and stuff. But still, a once-a-day check… After a while, trends would form. Anyone have any thoughts about it?
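The heat map half of that idea is the easy part to sketch. The snippet below skips the scraping and the address handwork entirely and assumes each post has already been turned into a (latitude, longitude) pair by hand or otherwise. It just snaps points to a grid and counts them; `grid_cell`, `heat_counts`, and the 0.01-degree cell size are all choices made for illustration.

```python
import math
from collections import Counter


def grid_cell(lat, lon, cell_size=0.01):
    """Map a coordinate to the integer (row, col) index of its grid cell."""
    return (math.floor(lat / cell_size), math.floor(lon / cell_size))


def heat_counts(posts, cell_size=0.01):
    """Count posts per grid cell; posts is an iterable of (lat, lon) tuples."""
    return Counter(grid_cell(lat, lon, cell_size) for lat, lon in posts)


# Hypothetical, hand-geocoded sample points roughly in the Twin Cities area.
sample_posts = [
    (44.9778, -93.2650),
    (44.9779, -93.2651),
    (44.9537, -93.0935),
]

counts = heat_counts(sample_posts)
# Daily runs can be merged with counts.update(heat_counts(todays_posts)).
hottest = counts.most_common(1)
```

Because `Counter` supports `update`, a once-a-day run can keep folding new posts into the same totals, and the busiest cells are what would get shaded darkest on the map.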
Anyway, I really need to learn to cut these posts up into tiny bite-sized pieces but meh, I’ll work on it later.
Footnotes listed in the above post:
1. Plus added a sweet footnotes plugin so I can do this
2. Damn, I remember trick or treating in the snow
3. or supporting evidence
4. or identity, or even a place for that matter
5. in this case, me
6. Graphology
7. They offer an XML feed, but it’s just the title of the post, so it’s all but useless for this
8. I think it would be a little cool if someone saw me and put something in there about me
9. Though I read it to laugh and feel sorry for the shy people, or the weird posts that make it in there




