Duke87

Robots.txt and the Wayback Machine...


Now, I know that nothing from the past couple of years is archived on the Wayback Machine because the site started blocking bots from accessing it....

...but now it would seem as though the older pages which were archived have retroactively been blocked out:

robotstxtexclusion.png

"robots.txt", eh? Well, such a file does indeed exist here. It says the following as of this posting:

User-agent: Googlebot
Disallow:

User-agent: googlebot-image
Disallow:

User-agent: googlebot-mobile
Disallow:

User-agent: *
Disallow: /

That last item with the asterisk (which, if I understand correctly, disallows anything and everything for any bot not named above it) seems to be the sticking point here. Wayback probably modified their policy so that this rule not only stops their bots from making new archives but now also denies access to the old ones.
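For anyone who wants to check that reading of the file, here is a rough sketch using Python's standard `urllib.robotparser`. The rules are the ones quoted above, written with each `Disallow` directly under its `User-agent` line. The helper name `is_allowed` and the example URL are made up for illustration, and this only shows how Python's parser interprets the file, not how any particular crawler actually behaves.

```python
from urllib.robotparser import RobotFileParser

# The rules quoted in the post, one group per crawler.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow:

User-agent: googlebot-image
Disallow:

User-agent: googlebot-mobile
Disallow:

User-agent: *
Disallow: /
"""


def is_allowed(robots_txt, agent, url="https://example.org/some/page"):
    """Return True if `agent` may fetch `url` under the given rules.

    `url` is a hypothetical placeholder; only its path is matched.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Compare a crawler with its own group against one that falls under "*".
for agent in ("Googlebot", "ia_archiver"):
    print(agent, is_allowed(ROBOTS_TXT, agent))
```

An empty `Disallow:` means "nothing is off limits", so the named Google crawlers get through, while everything else lands in the `*` group and is shut out of the whole site.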

Was anyone aware of this? Is there anything that can be done about it?


If you always take the same road, you will never see anything new.
If you can read this, you deserve a cookie.

Kindly ask Dirk to add a line called:

User-Agent: Wayback Machine

Allow:

Regards,

Korot

  • Original Poster
    Originally posted by: Korot

    Kindly ask Dirk to add a line called:

    User-Agent: Wayback Machine

    Allow:

    According to them, the agent is called "ia_archiver", not "Wayback Machine".
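A minimal sketch of what the corrected entry might look like, again checked with Python's `urllib.robotparser`. It assumes the "ia_archiver" name mentioned above is the one the crawler matches on, and it uses an empty `Disallow:` in the same style as the existing Googlebot entries. The helper names (`with_agent_allowed`, `can_crawl`), the shortened rule set, and the URL are all hypothetical. Whether the Wayback Machine would then show the old captures again is up to them; this only tests how the file parses.

```python
from urllib.robotparser import RobotFileParser

# Shortened stand-in for the site's current rules.
CURRENT_RULES = """\
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
"""


def with_agent_allowed(robots_txt, agent):
    """Prepend a group that lets `agent` crawl everything (empty Disallow)."""
    return f"User-agent: {agent}\nDisallow:\n\n{robots_txt}"


def can_crawl(robots_txt, agent, url="https://example.org/some/page"):
    """Return True if `agent` may fetch `url`; the URL is a placeholder."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


proposed = with_agent_allowed(CURRENT_RULES, "ia_archiver")
print(proposed)

# The group only helps a crawler that identifies itself by that exact name.
for agent in ("ia_archiver", "Wayback Machine"):
    print(agent, can_crawl(proposed, agent))
```

The second check is the reason the name matters: a group is only picked up by a crawler whose user-agent string matches it, so an entry under any other name would leave the crawler in the catch-all `*` group.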


    If you always take the same road, you will never see anything new.
    If you can read this, you deserve a cookie.


    Does that block the Google bots that take caches of the pages? (in case of a hack.... again)

    So, I checked it out: it doesn't seem to be blocked, and the caches were fairly new (as in taken 2 minutes ago).


    This signature does not exist. Continue on.
