AAroads forum archive?

Started by bugo, June 17, 2015, 07:55:29 PM

bugo

I've often pondered the wealth of information that has been shared on this forum since it began in 1972. This forum is a treasure trove of knowledge and some of it has only been publicly shared on this forum. I've also wondered what will happen when this forum gets shut down. I'm not saying the forum is in danger of being shut down, but all things have a beginning and an end. It might be tomorrow and it might be in the year 2121.

My point is that there should be several places the information contained in this forum can be archived just in case the forum gets shut down for whatever reason. I'm thinking of the future, say 50 years from now. If this forum goes away, then all that knowledge is lost. There needs to be a way to preserve the information in the forum just in case.

Imagine how cool it would be if this site had been around 50 years ago. A lot of questions that we don't have the answers for would be contained in the forum's archive. In 50 years I want future road enthusiasts to be able to access this information. The highway system today is well documented on this forum and should be preserved for future generations. I've personally done lots of research on different topics and posted the results here and nowhere else. If that information is lost to future generations, it would be a tragedy.

What I'm basically saying is that this site should be backed up or mirrored. Is there an easy way (a software package) to archive this information? That way if the owners of the forum decide to shut the forum down abruptly there would be mirror sites with all the information. I think of MTR and all the knowledge that has been lost because of Google's piss-poor USENET archive. I would hate for all the work and research that have been done and posted in this forum to disappear forever. So who wants to archive the site (as long as they got permission from Alex, and I know him well and he wouldn't have a problem with it)?


US71

Good idea! Thanks for volunteering ;)  :coffee: :thumbsup:
Like Alice I Try To Believe Three Impossible Things Before Breakfast

bugo

I wouldn't even know where to start.

iBallasticwolf2

I love the idea of an archive. It would give great info for generations to come. Imagine if, say, a new type of highway system was built that was superior to interstate highways, and interstate highways were completely ripped up with barely any traces. Future generations would want to know about "Interstate highways", and without an archive of info on them they probably won't be able to get it.
Only two things are infinite in this world, stupidity, and I-75 construction

Zeffy

SQL backups preserve the integrity of the forum here, and with a clean SMF install you can basically make any installation into the backup. Then you can take the SQL and do what you want with it.
Life would be boring if we didn't take an offramp every once in a while

A weird combination of a weather geek, roadgeek, car enthusiast and furry mixed with many anxiety related disorders

bugo

Quote from: iBallasticwolf2 on June 17, 2015, 08:27:16 PM
I love the idea of an archive. It would give great info for generations to come. Imagine if, say, a new type of highway system was built that was superior to interstate highways, and interstate highways were completely ripped up with barely any traces. Future generations would want to know about "Interstate highways", and without an archive of info on them they probably won't be able to get it.

Yes. You get it. That's exactly why I think there should be an archive.

corco

Quote from: bugo on June 17, 2015, 10:05:39 PM
Quote from: Zeffy on June 17, 2015, 08:51:42 PM
SQL backups preserve the integrity of the forum here, and with a clean SMF install you can basically make any installation into the backup. Then you can take the SQL and do what you want with it.



You don't get it (big surprise there). I'm talking about an offsite archive. Or twelve. The more redundancy the better. If you do a backup on the same site and that site is compromised, the archives will be lost as well. C'mon, you can't be this dense.

Uh, the backup file would be stored off site....

Molandfreak

Quote from: corco on June 17, 2015, 10:11:55 PM
Quote from: bugo on June 17, 2015, 10:05:39 PM
Quote from: Zeffy on June 17, 2015, 08:51:42 PM
SQL backups preserve the integrity of the forum here, and with a clean SMF install you can basically make any installation into the backup. Then you can take the SQL and do what you want with it.
You don't get it (big surprise there). I'm talking about an offsite archive. Or twelve. The more redundancy the better. If you do a backup on the same site and that site is compromised, the archives will be lost as well. C'mon, you can't be this dense.
Uh, the backup file would be stored off site....
Would it be accessible to anyone, though?
Quote from: Max Rockatansky on December 05, 2023, 08:24:57 PM
AASHTO attributes 28.5% of highway inventory shrink to bad road fan social media posts.

corco

Quote from: Molandfreak on June 18, 2015, 12:26:29 AM
Quote from: corco on June 17, 2015, 10:11:55 PM
Quote from: bugo on June 17, 2015, 10:05:39 PM
Quote from: Zeffy on June 17, 2015, 08:51:42 PM
SQL backups preserve the integrity of the forum here, and with a clean SMF install you can basically make any installation into the backup. Then you can take the SQL and do what you want with it.
You don't get it (big surprise there). I'm talking about an offsite archive. Or twelve. The more redundancy the better. If you do a backup on the same site and that site is compromised, the archives will be lost as well. C'mon, you can't be this dense.
Uh, the backup file would be stored off site....
Would it be accessible to anyone, though?

If it were put on another site

Zeffy

Quote from: corco on June 18, 2015, 12:29:26 AM
Quote from: Molandfreak on June 18, 2015, 12:26:29 AM
Would it be accessible to anyone, though?

If it were put on another site

And that was my intent in my original post. You back it up on a set schedule to your own computer, and, for double protection, a recovery drive. As long as SQL exists at any point in the future, the data will still be there. From there you can upload it to another site. There are plenty of ways to keep this stuff available for years to come. If Alex wanted me to, I would easily do it too.
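
Editor's sketch of the dump-and-restore idea described above. Assumptions, stated loudly: the live forum presumably runs on a server database, not SQLite; SQLite is swapped in here only because it ships with Python and needs no server. The `messages` table and its columns are hypothetical and are not SMF's real schema. The point is only that a database can be serialized to a plain SQL text file, copied anywhere, and loaded into an empty database later.

```python
import sqlite3


def dump_to_sql(conn: sqlite3.Connection) -> str:
    """Serialize the whole database as a plain-text SQL script."""
    return "\n".join(conn.iterdump())


def restore_from_sql(sql_script: str) -> sqlite3.Connection:
    """Load a SQL script into a fresh, empty database (the 'clean install')."""
    fresh = sqlite3.connect(":memory:")
    fresh.executescript(sql_script)
    return fresh


# Hypothetical stand-in for the forum's post table; NOT the real SMF schema.
live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, poster TEXT, body TEXT)")
live.executemany(
    "INSERT INTO messages (poster, body) VALUES (?, ?)",
    [("bugo", "AAroads forum archive?"), ("Zeffy", "SQL backups preserve the forum.")],
)
live.commit()

backup_sql = dump_to_sql(live)            # this text is what would be copied off site
restored = restore_from_sql(backup_sql)   # later: rebuild from the text alone
restored_rows = restored.execute(
    "SELECT poster, body FROM messages ORDER BY id"
).fetchall()
print(restored_rows)
```

Because the backup is ordinary text, it can be kept on a home computer, a recovery drive, and another site at the same time, which is the redundancy being asked for in this thread.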
Life would be boring if we didn't take an offramp every once in a while

A weird combination of a weather geek, roadgeek, car enthusiast and furry mixed with many anxiety related disorders

corco

Quote from: Molandfreak on June 18, 2015, 12:53:05 AM
Quote from: Zeffy on June 18, 2015, 12:35:21 AM
Quote from: corco on June 18, 2015, 12:29:26 AM
Quote from: Molandfreak on June 18, 2015, 12:26:29 AM
Would it be accessible to anyone, though?
If it were put on another site
And that was my intent in my original post. You back it up on a set schedule to your own computer, and, for double protection, a recovery drive. As long as SQL exists at any point in the future, the data will still be there. From there you can upload it to another site. There are plenty of ways to keep this stuff available for years to come. If Alex wanted me to, I would easily do it too.
But your hard drive would have to be able to store terabytes of data to get it all. Can over 372,000 posts really fit on a reasonably-sized hard drive?

Really? Somehow every single post magically fits on whatever server is hosted here (in addition to the massive content hosted on AARoads), and I'd sure hope Alex isn't paying to host terabytes worth of data. Since all the images contained within forum posts are hosted off-site, it's all text. I bet all 372,000 posts don't make up more than 10-15 gigs.

andy3175

Yes, I agree that an archive of old posts is something we should investigate. Many older posts have gotten disconnected from their in-line images. It is true that the total usage of the Forum portion of the site is not as large as it could be, since AARoads is not hosting Forum-related pictures. But that is something Alex and I have discussed previously and something we will continue to deliberate as the Forum gets larger and more history is stored on the server.

Thanks Jeremy for your volunteerism on this; as usual I appreciate it!
Regards,
Andy

www.aaroads.com

Zeffy

Quote from: corco on June 18, 2015, 12:57:40 AM
Really? Somehow every single post magically fits on whatever server is hosted here (in addition to the massive content hosted on AARoads), and I'd sure hope Alex isn't paying to host terabytes worth of data. Since all the images contained within forum posts are hosted off-site, it's all text. I bet all 372,000 posts don't make up more than 10-15 gigs.

Yes, exactly! Databases actually aren't as big as people think in terms of file size, which is why they are so efficient. All of the PHP content on the main AARoads site, for example, is only a whopping 40 MB in terms of a .SQL file. (I'm talking about the content in the database that is used to display the pages, of course, not the actual PHP files that you use to access the content.) Sure, this forum has a lot of posts and whatnot, but I honestly can't see an SQL backup of the forum being more than maybe 500 MB tops.
Life would be boring if we didn't take an offramp every once in a while

A weird combination of a weather geek, roadgeek, car enthusiast and furry mixed with many anxiety related disorders

bugo

There should be (and should have been all along) preservation of pictures posted to the forum (except for Brian556's destruction of that poor toilet) because I'll read a post from, say, 2011 and the pictures will be gone. This would take up a lot of disk space but hard drives are getting cheaper and will probably eventually be replaced by something like SD card technology.

iBallasticwolf2

Quote from: bugo on June 18, 2015, 10:10:39 AM
There should be (and should have been all along) preservation of pictures posted to the forum (except for Brian556's destruction of that poor toilet) because I'll read a post from, say, 2011 and the pictures will be gone. This would take up a lot of disk space but hard drives are getting cheaper and will probably eventually be replaced by something like SD card technology.

Or SSDs, which are becoming more popular, as well as flash.
Only two things are infinite in this world, stupidity, and I-75 construction

Bickendan

When you say archive, do you mean something akin to the m.t.r faq?

Zeffy

Quote from: bugo on June 18, 2015, 10:10:39 AM
There should be (and should have been all along) preservation of pictures posted to the forum (except for Brian556's destruction of that poor toilet) because I'll read a post from, say, 2011 and the pictures will be gone. This would take up a lot of disk space but hard drives are getting cheaper and will probably eventually be replaced by something like SD card technology.

Andy and Alex both mentioned allowing members to upload their pictures directly to the forum, and perhaps even giving them their own photo gallery (akin to Flickr, etc.), so that their images would never be lost as long as AARoads lived, which I think is a great idea. The reason it isn't like that is security concerns, but new SMF versions are making it harder for malicious files to be uploaded.
Life would be boring if we didn't take an offramp every once in a while

A weird combination of a weather geek, roadgeek, car enthusiast and furry mixed with many anxiety related disorders

Brandon

Quote from: bugo on June 17, 2015, 07:55:29 PM
I've often pondered the wealth of information that has been shared on this forum since it began in 1972.

1972?  This forum didn't even start until after 2000, IIRC.  Hell, if it was 1972, I'd have loved to have used it back in college instead of just mtr.
"If you think this has a happy ending, you haven't been paying attention." - Ramsay Bolton, "Game of Thrones"

"Symbolic of his struggle against reality." - Reg, "Monty Python's Life of Brian"

hotdogPi

Quote from: Brandon on June 18, 2015, 01:20:49 PM
Quote from: bugo on June 17, 2015, 07:55:29 PM
I've often pondered the wealth of information that has been shared on this forum since it began in 1972.

1972?  This forum didn't even start until after 2000, IIRC.  Hell, if it was 1972, I'd have loved to have used it back in college instead of just mtr.

2009.
Clinched

Traveled, plus
US 13, 50
MA 22, 35, 40, 53, 79, 107, 109, 126, 138, 141, 159
NH 27, 78, 111A(E); CA 90; NY 366; GA 42, 140; FL A1A, 7; CT 32, 320; VT 2A, 5A; PA 3, 51, 60, WA 202; QC 162, 165, 263; 🇬🇧A100, A3211, A3213, A3215, A4222; 🇫🇷95 D316

Lowest untraveled: 36

Scott5114

Quote from: Brandon on June 18, 2015, 01:20:49 PM
Quote from: bugo on June 17, 2015, 07:55:29 PM
I've often pondered the wealth of information that has been shared on this forum since it began in 1972.

1972?  This forum didn't even start until after 2000, IIRC.  Hell, if it was 1972, I'd have loved to have used it back in college instead of just mtr.
Later than that, even. The forum part of the site didn't start until around 2008, and the current incarnation of the board (i.e. what you can actually access) dates to 2009.
uncontrollable freak sardine salad chef

J N Winkler

Quote from: Zeffy on June 18, 2015, 09:44:08 AM
Quote from: corco on June 18, 2015, 12:57:40 AM
Really? Somehow every single post magically fits on whatever server is hosted here (in addition to the massive content hosted on AARoads), and I'd sure hope Alex isn't paying to host terabytes worth of data. Since all the images contained within forum posts are hosted off-site, it's all text. I bet all 372,000 posts don't make up more than 10-15 gigs.

Yes, exactly! Databases actually aren't as big as people think in terms of file size, which is why they are so efficient. All of the PHP content on the main AARoads site, for example, is only a whopping 40 MB in terms of a .SQL file. (I'm talking about the content in the database that is used to display the pages, of course, not the actual PHP files that you use to access the content.) Sure, this forum has a lot of posts and whatnot, but I honestly can't see an SQL backup of the forum being more than maybe 500 MB tops.

The forum has a post size limit of 20 KB.  This gives an outer size limit of 7.4 GB for the table that has post content, which will be by far the largest in the database.  The actual size is probably much smaller because very few posts even approach the 20 KB limit.  The real challenge would be archiving the images as well, but this forum is not really image-heavy (unlike, say, the Highways & Autobahns section of SkyscraperCity) and quite stringent pixel count limits are enforced, so I would be very surprised if the total size of a backup, including text and images hosted offsite, came to more than 20 GB.
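
Editor's note: the 7.4 GB outer limit above is just the post count times the per-post cap. A quick check, assuming decimal units (1 KB = 1,000 bytes, 1 GB = 10^9 bytes) and the 372,000-post figure quoted earlier in the thread. The 1 KB "typical post" figure is a made-up illustration, not a measurement of this forum.

```python
POST_COUNT = 372_000          # post count quoted earlier in the thread
POST_LIMIT_BYTES = 20 * 1000  # 20 KB cap per post, decimal kilobytes assumed


def total_gb(post_count: int, bytes_per_post: int) -> float:
    """Total size in decimal gigabytes if every post had the given size."""
    return post_count * bytes_per_post / 1e9


upper_bound = total_gb(POST_COUNT, POST_LIMIT_BYTES)
print(f"every post at the cap: {upper_bound:.2f} GB")  # prints 7.44 GB

# Hypothetical average of 1 KB per post, purely for illustration.
illustrative = total_gb(POST_COUNT, 1000)
print(f"hypothetical 1 KB average: {illustrative:.3f} GB")  # prints 0.372 GB
```

With binary units (1 KB = 1,024 bytes, 1 GiB = 2^30 bytes) the same bound comes out near 7.1 GiB, so the conclusion does not depend on which convention is used.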
"It is necessary to spend a hundred lire now to save a thousand lire later."--Piero Puricelli, explaining the need for a first-class road system to Benito Mussolini

vtk

I don't mind if someone wants to archive images posted to http://vidthekid.info/imghost/ which are primarily used in my forum posts. I don't plan on removing any of them or discontinuing the site, but shit happens, so...
Wait, it's all Ohio? Always has been.

bugo

Quote from: Bickendan on June 18, 2015, 11:05:05 AM
When you say archive, do you mean something akin to the m.t.r faq?

Not at all. I'm referring to multiple backups of every post on the forum.

Dougtone

I would have to wonder if Forum information is stored on the Wayback Machine (archive.org) or a similar archival website.

J N Winkler

Quote from: Dougtone on June 19, 2015, 09:29:49 AM
I would have to wonder if Forum information is stored on the Wayback Machine (archive.org) or a similar archival website.

I've just had a look in the Wayback Machine.  The top page is archived, but none of the dynamically generated pages appear to be stored or retrievable.  Assignment of a PHP session ID seems to be what is breaking things.

This loads:

https://web.archive.org/web/20140824130645/https://www.aaroads.com/forum/index.php

But this does not:

https://web.archive.org/web/20140824130645/https://www.aaroads.com/forum/index.php?PHPSESSID=e3c32f5d3752144cb095ae5e5db77a81&board=22.0

AFAICT, the PHP session ID is not part of URLs generated by the live forum and displayed in the browser bar.  It is just something that appears in Web Archive URLs and I suspect it has something to do with their retrieval code, not anything on the AARoads end.
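
Editor's sketch: one way to compare a session-tagged URL like the one above with the form the live forum shows in the browser bar is to drop the `PHPSESSID` parameter. The helper name `strip_session_id` is hypothetical, and nothing here claims the Wayback Machine would then serve the page; it only normalizes the URL. It assumes parameters are separated by `&`, as in the example above; URLs that separate parameters with `;` would need extra handling.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_session_id(url: str, param: str = "PHPSESSID") -> str:
    """Return the URL with the named query parameter removed, order preserved."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != param
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


url = (
    "https://www.aaroads.com/forum/index.php"
    "?PHPSESSID=e3c32f5d3752144cb095ae5e5db77a81&board=22.0"
)
print(strip_session_id(url))
# prints https://www.aaroads.com/forum/index.php?board=22.0
```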

I certainly wouldn't count on the Web Archive to save us if the forum ever gets hosed.  One would have much better luck just writing a wget wrapper script to crawl the forum and take a copy of every single thread page together with all page prerequisites like embedded images and so on, though this won't archive anything that is not visible on a general user account.  The much better solution is to generate a redundant backup of the database tables themselves using admin access.
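
Editor's sketch of the crawl idea above, written in Python rather than as a wget wrapper (a deliberate swap, so the logic can be shown and tested without a network). Everything named here is hypothetical: the `crawl` function, the `fetch` callback, and the `forum.example` URLs. It only follows `<a href>` links under a given prefix and keeps the HTML; it does not download embedded images or other page prerequisites, which wget's `--page-requisites` option is designed for, and like any crawl it only sees what the account doing the fetching can see.

```python
from collections import deque
from html.parser import HTMLParser
from typing import Callable, Dict, List
from urllib.parse import urldefrag, urljoin


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url: str, fetch: Callable[[str], str], prefix: str,
          max_pages: int = 1000) -> Dict[str, str]:
    """Breadth-first crawl; returns {url: html} for pages under `prefix`."""
    saved: Dict[str, str] = {}
    queue = deque([start_url])
    seen = {start_url}
    while queue and len(saved) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        saved[url] = html
        collector = LinkCollector()
        collector.feed(html)
        for href in collector.links:
            # Resolve relative links and drop "#msg..." style fragments.
            absolute, _fragment = urldefrag(urljoin(url, href))
            if absolute.startswith(prefix) and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return saved


# Tiny in-memory "site" standing in for the forum, so the sketch runs offline.
fake_site = {
    "https://forum.example/index.php":
        '<a href="index.php?topic=1.0">thread</a>'
        '<a href="https://elsewhere.example/">offsite</a>',
    "https://forum.example/index.php?topic=1.0":
        '<a href="index.php">home</a>'
        '<a href="index.php?topic=1.0#msg5">same page anchor</a>',
}
pages = crawl("https://forum.example/index.php", fake_site.__getitem__,
              prefix="https://forum.example/")
print(sorted(pages))
```

A real run would pass a `fetch` that performs an HTTP request and would want polite rate limiting; as the post says, a database backup taken with admin access remains the more complete option.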
"It is necessary to spend a hundred lire now to save a thousand lire later."--Piero Puricelli, explaining the need for a first-class road system to Benito Mussolini
