Wild card URL search thing at archive org’s WayBack Machine?

DocAElstein
5StarLounger
Posts: 651
Joined: 18 Jan 2022, 15:59
Location: Re-routing rivers, in Hof, Beautiful Bavaria

Post by DocAElstein »

Hi,
A few months back, while I was spending some time looking for stuff at archive org, I thought I had stumbled by brute force on some hack when I used some sort of "wild card URL link" like this
https://web.archive.org/web/*/http://wl.dlservice.microsoft.com/download*
Delving into things there, I sometimes read accompanying notes to some downloads saying things like "….not available to the public…." or "….posted by the Jasmin Tunisian Revolution party …."
I thought I had stumbled on some secret internal archive link, especially as some of the downloads, which were masquerading as legitimate Microsoft .exe downloads, actually had some naughty stuff in them in addition to what they should have.

But I saw something similar from Hans here
https://web.archive.org/web/*;type=text/wopr.com/*

Can anyone enlighten me on what those "wild card URL link" things are?
Is it an archive org thing, or just some standard URL wildcard thing? Is there any documentation, or are there any blogs, on that (or generally on how to search things at archive org)?
For example, what is ;type=text about ?

Thanks
Alan
I seriously don’t ever try to annoy. Maybe I am just the kid that missed being told about the King’s new magic suit, :(

HansV
Administrator
Posts: 78977
Joined: 16 Jan 2010, 00:14
Status: Microsoft MVP
Location: Wageningen, The Netherlands

Re: Wild card URL search thing at archive org’s WayBack Machine?

Post by HansV »

See Search – A Basic Guide. This is specific to the Wayback Machine - there is no universal search protocol that applies to all websites.
Best wishes,
Hans

DocAElstein
5StarLounger
Posts: 651
Joined: 18 Jan 2022, 15:59
Location: Re-routing rivers, in Hof, Beautiful Bavaria

Re: Wild card URL search thing at archive org’s WayBack Machine?

Post by DocAElstein »

Thanks, I took a good look, and that helps clear a few things up for me.
That link is actually a general bit of help documentation for archive org, which is nice, as it shows that archive org is a general archive place, open to the public. It just happens to use the internet rather than a real life building with a lot of shelves in it. (For example, my ex Mother in Law could join and upload her Christmas stollen cake recipe, if she felt like sharing it, but she probably wouldn’t, as apart from that and looking pretty, she was no good at anything else, so I expect she would want to hold on to the secret.)
archive org is also often referred to by the general name Internet Archive.
It has E-books, for example, and the general idea of that is close to your real life public library building with shelves, books, and that obligatory little guy in the black suit from the 1950’s telling you to be quiet.

Like a few people, I often got confused between archive org and web archive org.
For some time I was confused when someone offered me a download link to a file, and the link started with either
_ web archive
, or just
_ archive

The web bit is just one part of archive org (Internet Archive), and it includes mainly the Wayback Machine.

To clarify: you could find a download for some .exe file either
_ from web archive org, which in theory would come from a capture of a site that was, or had been, offering it (although some of my experience suggests that someone has found a way to tamper with some of those offered files, or to somehow fool archive org into storing a page faked from a real one that has not been available for years)
, or
_ from archive org, where you could find a few offers of the same .exe file. Those most probably would not have been uploaded by my ex Mother in Law, but they could have been, or by anyone else. There is no real control, although you could leave a comment there to tell people you think a file is bad.

_.____

There is a sub section at that link, Search – A Basic Guide, titled Wayback Machine Search, which shows how to pick out a particular capture of a site for which they have many dated captures. That’s useful to know about, as it’s not very intuitive and I have often had to explain it to people. (It might be a bit out of date, as things look a bit different now, but it gives the general idea.)

Right at the end of that Wayback Machine Search sub section is another link, Wayback Machine search. That page has a video at the start which is a bit more detailed. I thought I would make a quick copy of that for us, just to archive a bit of archive stuff:
Wayback Machine Intro.mp4 : https://app.box.com/s/e7vzhii3vr102r84sznrhseibufuz7zh
Wayback Machine Intro_wmv2.wmv : https://drive.google.com/file/d/1vzgSWj ... =drive_web

(That video says nought about the "wild card URL link" stuff.) The video is done by a woman I have seen in a few places talking about a few things. She is a librarian in real life, so that makes sense.


Somewhere down in the text at Wayback Machine search, it says this:
The best way to see all the files we have archived of the site is:
http://web.archive.org/*/www.yoursite.com/* …………
So that ties up with mine and yours
https://web.archive.org/web/*/http://wl.dlservice.microsoft.com/download*
https://web.archive.org/web/*;type=text/wopr.com/*
( I still don’t know what that ;type=text is in yours? )
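
As a rough illustration of how those links are put together, here is a minimal Python sketch. It assumes only the pattern seen in this thread: an asterisk where the date code would go, and a trailing asterisk on the address. The function name wayback_prefix_listing_url is made up for this example, and the sketch deliberately leaves out the ;type=text part, since what that does is still an open question here.

```python
WAYBACK_BASE = "https://web.archive.org/web"


def wayback_prefix_listing_url(target: str) -> str:
    """Build a Wayback Machine link that lists captures under a URL prefix.

    Hypothetical helper: it only reproduces the pattern seen in this thread,
    i.e. "*" where the date code would go, and "*" appended to the address.
    """
    # Avoid a doubled asterisk if the caller already supplied the wildcard.
    prefix = target.rstrip("*")
    return f"{WAYBACK_BASE}/*/{prefix}*"


if __name__ == "__main__":
    # Reproduces the two "URLs" style links discussed in the thread.
    print(wayback_prefix_listing_url("http://wl.dlservice.microsoft.com/download"))
    print(wayback_prefix_listing_url("https://eileenslounge.com/"))
```

Pasting either printed link into a browser should be the same as typing the address into the Wayback Machine box with the URLs option selected.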


A bit further down in the text is also:
…..You can see a listing of the dates of the specific URL by replacing the date code with an asterisk (*), ie:
http://web.archive.org/*/www.yoursite.com ……….
I have seen that sort of thing as well, in links like these, which I had also wrongly assumed were another hack I had tripped over:
https://web.archive.org/web/*/http://wl.dlservice.microsoft.com/download/E/3/E/E3EEC6D6-1141-40C4-840F-770F99B67986/en/wlsetup-all.exe
https://web.archive.org/web/20240000000000*/http://wl.dlservice.microsoft.com/download/E/3/E/E3EEC6D6-1141-40C4-840F-770F99B67986/en/wlsetup-all.exe

That is mentioned, sort of, in the video, but not in terms of the wild card stuff. It’s talked about as the Calendar option.
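
The two wlsetup-all.exe links above differ only in the date-code part, so the idea can be sketched in a few lines of Python. This is just an illustration of the pattern, assuming the year form is the four-digit year followed by ten zeros and an asterisk, as in the 2024 link. The function name wayback_calendar_url is invented for the example.

```python
from typing import Optional

WAYBACK_BASE = "https://web.archive.org/web"


def wayback_calendar_url(target: str, year: Optional[int] = None) -> str:
    """Build a Calendar-style Wayback link for one specific address.

    Hypothetical helper based only on the links seen in this thread:
      no year -> date code is just "*"
      a year  -> date code is YYYY + ten zeros + "*", e.g. 20240000000000*
    """
    if year is None:
        date_code = "*"
    else:
        date_code = f"{year:04d}" + "0" * 10 + "*"
    return f"{WAYBACK_BASE}/{date_code}/{target}"


if __name__ == "__main__":
    site = "https://eileenslounge.com/"
    print(wayback_calendar_url(site))        # all dates
    print(wayback_calendar_url(site, 2024))  # calendar opened at 2024
```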

_.________

In fact, you can see these "wild card things" appear in the URL bar of your browser as you do things:
Example:
_ Go to the archive org (Internet Archive) web Wayback machine place: https://web.archive.org/web/ ( https://i.postimg.cc/gcyMRt6t/archive-o ... -place.jpg )
_ Type in something like https://eileenslounge.com/ ( https://i.postimg.cc/zvk0r3rY/Type-in-s ... ge-com.jpg )
_ Hit Enter. (If the URLs option is not selected, then select it. It’s usually the default.) ( https://i.postimg.cc/zXJxkwCz/Hit-Enter-select-URLs.jpg )
_ Now take a look at the URL link showing in your browser - it turns into the first sort of "wild card thing" that we have been discussing
https://web.archive.org/web/*/https://eileenslounge.com/*

_ Now click on Calendar ( https://i.postimg.cc/DZLxxBpZ/click-on-Calendar.jpg )
_ You will now see another sort of "wild card thing" in your browser
https://web.archive.org/web/20240000000000*/https://eileenslounge.com/
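
Going the other way, the steps above suggest you can tell which view a link opens just from where the asterisk sits. Here is a small Python sketch of that guess. The function name describe_wayback_link and the labels it returns are my own, and the rule is inferred only from the links in this thread, not from any official description. It also only handles links containing /web/, so the older http://web.archive.org/*/ form quoted from the help text is not covered.

```python
import re

# Splits https://web.archive.org/web/<date code>/<address> into its two parts.
_WAYBACK_RE = re.compile(r"^https?://web\.archive\.org/web/([^/]+)/(.+)$")


def describe_wayback_link(url: str) -> str:
    """Guess which Wayback Machine view a link opens (assumed rule, see above)."""
    match = _WAYBACK_RE.match(url)
    if match is None:
        return "other"      # not a web.archive.org/web/ link at all
    date_code, address = match.groups()
    if address.endswith("*"):
        return "urls"       # wildcard on the address: the URLs listing
    if date_code.endswith("*"):
        return "calendar"   # wildcard only in the date code: the Calendar
    return "capture"        # no wildcard: one particular dated capture


if __name__ == "__main__":
    print(describe_wayback_link("https://web.archive.org/web/*/https://eileenslounge.com/*"))
    print(describe_wayback_link("https://web.archive.org/web/20240000000000*/https://eileenslounge.com/"))
```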

_.____________

So, all good stuff. I am a bit wiser now, and things are falling into place. Thanks

_.__________

( I don’t see anywhere yet any mention of the top right word(s) filter box that I mentioned in a related post today. I noticed it a few months ago, and only today did I notice that you can have multiple words, separated by a space. It’s one of those little bonuses that often occur when I am preparing a beautiful post, :)
I best patent that quick to earn a few Euros when anyone uses it, :) )


Alan