2023-06-16 03:03:27
2023-06-16 03:03:26
2023-06-16 02:55:08
Someone kicked a decade of seminal pre-internet communications off the internet.
We all know Google Search has seriously degraded, with tons of duplicate and garbage content from content farms (which I’m sure carry lots of Google ads, so perhaps they don’t care; we're not the customer). But also, searching for my own name (which is globally unique) no longer returns nearly as much as it used to. It used to return hundreds if not thousands of hits from various mailing list archives, not to mention old Usenet posts, and everything I've written online since.
So for fun, I did a search and ran down the list. Basically, after about 44 results, it’s just a mix of Mastodon posts (often reshares, and not including my profile), an occasional random mailing list post, and references to my megapost on Pseudonyms from the Google+ nymwars.
But here’s the shocker.
The Usenet results are gone.
When I set date restrictions on my name search, I can’t find anything before 1992. Some of that is because individual articles aren’t being stored on web sites anymore, and the few mailing list archives don’t have dates that Google recognizes. I thought maybe that was the case for Usenet as well, but nope. It’s been removed from the internet because some asshole apparently went after them with lawyers to get something redacted. I used to be able to search for things I wrote back in the early 80’s. But no longer.
That’s painful. An important part of internet history has been erased. (I know people have private copies of the archive, and I even know some of those people, but that’s not the same.)
For what it’s worth, this is what I got from Google. There are more details on the UtZoo Usenet archive at the bottom of this post. There are no blog posts of mine here because they are all offline right now, but I'll fix that soon. Those will go back to 1997.
The weighting here is very biased towards commercial walled gardens. It's clearly no longer based on references from other sites, or my Pseudonym megapost would be much higher. It's based on status of web sites, not content. It's biased *against* content.
1. LinkedIn
2. Instagram
3. Academia.edu
4. Flickr (haven't posted anything there in years)
5. YouTube (ditto)
6. www.Pinterest (*very* ditto, not sure I've ever posted anything there)
7. Quora (ditto)
8. Apollo.io (scraped from LinkedIn, well done Google)
<break for some images, all actually mine>
9. Usenix.org (paper I'm listed as a co-author on)
10. Goodreads
11. GitHub
12. Facebook
13. Foursquare (ancient)
14. W3C.org (mailing list post from 1996...the first hit that I'd consider old-style internet content)
15. Gawker (article about a blog post I made a long time ago about a Sarah Lacy interview with Zuck; I mapped Twitter sentiment to the video of the interview)
16. ThreadReaderApp (some of my twitter threads)
17. Palmer House Inn (article about Sandy Neck Lighthouse that mentions me)
18. Infosec Exchange (finally, Mastodon, my most active social media)
19. opensource.apple.com (some code I wrote a very long time ago)
20. tr.pinterest (WTF google? Again?)
21. Tribute Archive (my aunt's obit)
22. PCWorld on abcnews.go.com (mention of a blog post I wrote analyzing the Google Orkut worm--remember Orkut?)
23. Portland Press Herald (my aunt's obit again, sigh)
24. blogs.gnome.org (Kurt von Finck's blog referencing a tiny blog post I made about being in Maine)
25. perl.apache.org (changelog for Embperl mentioning a bug I reported)
26. Stack Overflow (my home page, again, old)
27. ScienceDirect (description of a paper I wrote for INTERACT '87)
28. support.google.com (support question)
29. Ad, offering to search for info about me in Maine (presumably because that's my current location)
30. cohost.org (post, summary oddly pulls in the last sentence of my bio, which mentions my daughter)
31. spaf.cerias.purdue.edu (Yucks Digest V2, a (true story) joke I posted to rec.humor.funny. Hi @spaf)
32. Birdeye.com (a review of their dog doors from four years ago)
33. unice.fr (a copy of the Emacs bindings I made for Mac text areas)
34. Forbes.com (a comment on an article about Dragon Systems, with the wrong summary)
35. IRTF Anti-spam Research Group thread (another mailing list archive)
36. UCLA (reference to the web version of Phil Agre’s Red Rock Eater Digest that I maintained through 2004)
37. A reshared mastodon post about XYZ on DTSS
38. Another mastodon reshare
39. NetBSD (same software Apple had)
40. More Mastodon (this time my pixelfed account)
41. Playstation.net (copyright for same software again; in BSD libc)
42. perl.org (a mailing list post)
43. justia.com (a patent, the rest show up eventually elsewhere, very random)
44. tronche.com (Inter-Client Communication Conventions Manual for X Version 11, R6; I'm thanked for being in the public review)
After that, it's basically Mastodon posts and occasional mailing lists, and some references to my megapost on Pseudonyms. I used to be able to find Usenet stuff by limiting the date range to the 80's, but not anymore. If I date limit, the earliest content I find is from 1992 (a Google Groups post and a mention in the Motif Programming Manual ("just because he's cool" 🤣)), plus more copies of the ICCCM manual.
Searching for my name and “usenet” gets a Usenet search engine, http://benschmidt.org/usenet/, which does not appear to be working. The reason becomes clear…
Looking at archive.org/usenet, I find the quote below. As of 2020, they are offline. WTF?
> This is not a collection of the UTZOO Wiseman Usenet Archive.
>
> In 2020 after sustained legal demands requesting a set of messages within the Usenet Archive be redacted, and to avoid further costs and accusations of manipulation should those demands be met, the archive has been removed from this URL and is not currently accessible to the public.
>
> Included in this item is a file listing and the md5 sums of the removed files, for the use of others in verifying they have original materials.
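For anyone sitting on a private copy, that md5 listing is what lets you confirm you have the original files. Here's a minimal sketch of such a check. I'm assuming the listing uses the common md5sum layout (hash, whitespace, file name), which I haven't confirmed, and the names in the usage note are made up for illustration.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_listing(text):
    """Parse md5sum-style lines ("<hash>  <name>") into {name: hash}.

    Assumption: the published listing follows this layout.
    """
    expected = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2:
            # md5sum marks binary-mode entries with a leading "*".
            name = parts[1].strip().lstrip("*")
            expected[name] = parts[0].lower()
    return expected


def verify(listing_text, root):
    """Return {name: "ok" | "mismatch" | "missing"} for each listed file."""
    results = {}
    for name, want in parse_listing(listing_text).items():
        path = Path(root) / name
        if not path.is_file():
            results[name] = "missing"
        elif md5_of(path) == want:
            results[name] = "ok"
        else:
            results[name] = "mismatch"
    return results
```

Something like `verify(Path("md5sums.txt").read_text(), "my-utzoo-copy")` (both names hypothetical) would then report which files match the published sums. MD5 is fine for spotting accidental corruption, but it is not collision-resistant, so a match is not proof against deliberate tampering.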
No wonder it's not in search anymore. What the fuck.
If I search for "apollo!nazgul", I only find 7 results.
A decade of my life, of many people's lives, got erased from the internet.
#ComputerHistory #USENET #Search #Research