AARoads Forum

Non-Road Boards => Off-Topic => Topic started by: bugo on August 09, 2023, 08:58:21 PM

Title: Data hoarders
Post by: bugo on August 09, 2023, 08:58:21 PM
I am a data hoarder. I have tons of pictures, text files, audio files and other documents saved on my hard drives. Many of these documents are maps and other road-related documents such as meeting minutes, route logs and orders.

When a state highway department posts an updated map on their website, the old map is almost always removed. And when they're removed, you usually can't get them anywhere else unless somebody has saved it and sends it to you. That's why data hoarding is important.

We won't even start on mapping programs like Google Maps that are constantly being updated. It's easy to buy a 2010 paper road atlas, but with mapping software, there are no old editions to archive. This will make it more difficult for future road researchers to try to pin the date of the opening of a road by using maps.

Does anybody else obsessively archive maps and other road-related documents?
Title: Re: Data hoarders
Post by: oscar on August 09, 2023, 09:46:12 PM
Not "obsessive" exactly for me -- more like being a lazy pack rat. But in addition to old road atlases, I keep around (but have not installed on my computer) old versions of Microsloth Streets and Trips. The latest one is from 2013, but I still use it sometimes.

Also, I have old Alaska DOT&PF route logs for the few numbered and signed highways it maintains. The DOT says they're obsolete, but last I checked (long ago) new versions have not been posted online. The new versions are less detailed, and the lost details are still somewhat useful for maintaining Travel Mapping's Alaska route files.

I copied a lot of old Hawaii maps and documents, too. The most interesting instance where that came in handy was a copy of the map showing the temporary highway network on Oahu (Honolulu) set up ahead of World War II, which fell into disuse when the war was over. The original map I copied was stored in a basement, which later was destroyed in a flood. I web-posted my copy of the map, so it's still available for reference.

Title: Re: Data hoarders
Post by: Bruce on August 09, 2023, 09:52:49 PM
While I do keep some offline collections, especially for media that I know won't be saved in a proper fashion, there are places dedicated to archiving online stuff. The Internet Archive (https://archive.org/) operates the Wayback Machine (https://en.wikipedia.org/wiki/Wayback_Machine), which has been collecting cached webpages for longer than I have been alive and has trawled many DOT webpages and roadgeek sites. It's incredibly useful if you are able to find a link and timeframe from another mention (e.g. newspapers or internet forums) and can plug it in.
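
As a rough illustration of "plugging in" a link and timeframe, here is a minimal Python sketch that builds a Wayback Machine address from an old URL and an approximate date. The DOT address in the usage note is a made-up placeholder, and the helper names are hypothetical; only the `web.archive.org/web/<timestamp>/<url>` pattern and the `archive.org/wayback/available` endpoint come from the Internet Archive itself.

```python
from datetime import date
from urllib.parse import urlencode

WAYBACK_PREFIX = "https://web.archive.org/web"
AVAILABILITY_API = "https://archive.org/wayback/available"


def wayback_url(original_url: str, when: date) -> str:
    """Build an address asking the Wayback Machine for a capture near `when`."""
    stamp = when.strftime("%Y%m%d")  # timestamps are written YYYYMMDD[hhmmss]
    return f"{WAYBACK_PREFIX}/{stamp}/{original_url}"


def availability_query(original_url: str, when: date) -> str:
    """Build a query for the availability endpoint, which answers in JSON."""
    params = urlencode({"url": original_url, "timestamp": when.strftime("%Y%m%d")})
    return f"{AVAILABILITY_API}?{params}"
```

For example, `wayback_url("http://www.example.gov/maps/2010.pdf", date(2010, 6, 1))` yields an address that can be pasted into a browser; whether a capture actually exists depends on what the crawler saved.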

I just hope there's better archiving practices in place for DOTs in this digital era. A few have been consistent with their archiving of online documents (such as Oregon, who forwards their DOT work to the state library's online collections), but many haven't set things up properly.
Title: Re: Data hoarders
Post by: Max Rockatansky on August 09, 2023, 09:53:57 PM
I found hoarding data to be burdensome over the years.  It tended to grind on me more and more as time went on, to the point that I found letting go freeing.  This is probably the most significant reason I don't have stuff like a Travel Maps or Mob Rule account.

Now, I did get my photos organized back in 2016.  It was about a four month effort to complete but I somehow managed to break it up into healthy chunks of time.  I can't imagine I would be able to replicate the feat now that I'm married; my wife would be upset at that empty use of hours.

I do have an almost full collection of California Highways & Public Works which were donated to me due to Gribblenation.  I more or less hung onto a lot of maps (especially NPS) by accident given I like to occasionally look at them.
Title: Re: Data hoarders
Post by: formulanone on August 09, 2023, 10:10:18 PM
In similar vein, lots of my photos. I take about 25,000-40,000 pics in a year, and save everything to three portable hard drives, in a bit of a grandfather-father-son method. (It's honestly more like Stay-at-home-Father, Son, and the Wayward Son...)
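
For anyone unfamiliar with the term, a grandfather-father-son rotation keeps a frequently overwritten copy, a less frequent copy, and a rarely touched long-term copy. A minimal Python sketch of one common simplified schedule follows; the monthly/weekly/daily split and the function name are assumptions for illustration, not a description of the setup in this post.

```python
from datetime import date


def gfs_tier(day: date) -> str:
    """Pick which drive gets the backup on `day` in a simple GFS rotation."""
    if day.day == 1:
        return "grandfather"  # monthly copy, kept the longest
    if day.weekday() == 6:    # Monday is 0, so 6 is Sunday
        return "father"       # weekly copy
    return "son"              # daily copy, overwritten most often
```

The point of the scheme is that a mistake which corrupts the newest copy still leaves two older generations to fall back on.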

I'd save every file that wasn't a completely indistinguishable image...just in case. And for each image I modified, I'd save the full-sized-modified images and the smaller online version. About six months ago, I stopped saving the full-size "updated" images and just went with the smaller ones, to save time in the process of creation and backing it all up.

It took getting a new camera to finally get me to start culling through the old stuff. (After all, I don't yet feel like buying another new portable hard drive and spending a week to back that all up...again.) These images are now usually 2-3 times the file size from my previous camera, so do I really need 15 similar shots of the same road? Those test shots in the parking lot in the morning? Why did I think those images in total darkness were going to be masterful and sublime? So cleared out about 40-50 GB in stuff like that.

The problem is that I usually don't have as much time right away, and just sock them away in the folder until several months (or even years) later. But I'm getting better at being a little more picky and not worrying if IMG_2345 is almost identical to IMG_2354 and neurotically keeping all twelve of them, and not feeling like the task of preparing them for uploading is an insurmountable one.
Title: Re: Data hoarders
Post by: Bruce on August 09, 2023, 10:20:03 PM
I've got thousands of photos I need to process (aka basic editing and tagging) and upload to Wikimedia Commons before it's too late. I have backups but most are on-site.
Title: Re: Data hoarders
Post by: Rothman on August 09, 2023, 10:52:50 PM
I am old enough now to start wondering where everything will go when I pass away.  I can't imagine my kids preserving my stuff for more than a generation.
Title: Re: Data hoarders
Post by: Max Rockatansky on August 09, 2023, 11:01:22 PM
I know where my highway signs will probably go, I've bought way too many from estate sales to not know that answer.  Regarding my photos, since they aren't in a physical album they likely will be thrown in the trash or be purged from the online sources they are on.  At minimum some of the stuff I've written about might live through the years.  Given I don't have kids, general memory of me will likely fade within a couple decades.  As an example, I noticed US 66 ending at Fletcher Drive is already starting to make the rounds into more normalized circles of road interests.
Title: Re: Data hoarders
Post by: zachary_amaryllis on August 10, 2023, 07:40:46 AM
Quote from: bugo on August 09, 2023, 08:58:21 PM
I am a data hoarder. I have tons of pictures, text files, audio files and other documents saved on my hard drives. Many of these documents are maps and other road-related documents such as meeting minutes, route logs and orders.

When a state highway department posts an updated map on their website, the old map is almost always removed. And when they're removed, you usually can't get them anywhere else unless somebody has saved it and sends it to you. That's why data hoarding is important.

We won't even start on mapping programs like Google Maps that are constantly being updated. It's easy to buy a 2010 paper road atlas, but with mapping software, there are no old editions to archive. This will make it more difficult for future road researchers to try to pin the date of the opening of a road by using maps.

Does anybody else obsessively archive maps and other road-related documents?

I'm a datahoarder, though it's different data that I hoard.
Every 10 minutes, I (well, the cron job..) grabs pictures from 3 local radio towers. I've been doing this for almost a year now.

I have 3 cameras on the house, looking at different mountainy stuff, that snap a picture once every minute. That's been going on for about 6 months now. (I like to edit them into timelapses).

I have an older machine, that I've repurposed into sorta a NAS, it just sits there and spins the drives. Every last image is saved. No idea what I'll do with them, but it was fun to set up as sort of 'proof of concept'.
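
A setup like this mostly comes down to a schedule plus a naming scheme that never overwrites an earlier frame. In standard cron syntax, `*/10 * * * *` runs a job every 10 minutes and `* * * * *` every minute. The Python sketch below shows one way such a job might name its files; the directory layout, camera names and function name are hypothetical.

```python
from datetime import datetime
from pathlib import PurePosixPath


def snapshot_path(root: str, camera: str, taken: datetime) -> PurePosixPath:
    """Return root/camera/YYYY/MM/DD/camera_YYYYMMDD-HHMMSS.jpg."""
    day_dir = PurePosixPath(root) / camera / taken.strftime("%Y/%m/%d")
    # A timestamp in the file name keeps frames in order for timelapse editing.
    return day_dir / f"{camera}_{taken.strftime('%Y%m%d-%H%M%S')}.jpg"
```

Splitting by day keeps any single directory from growing without bound: one frame per minute is 1,440 files per camera per day.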
Title: Re: Data hoarders
Post by: HighwayStar on August 17, 2023, 12:11:27 AM
I've become more of a data hoarder with time. The biggest contributor to this has been a realization that the mantra of the 90's/2000's when people assumed information on the internet was permanent has turned out to be incorrect. Link rot, site bankruptcies etc. have long proved otherwise. Mind you, I don't care about everything, but generally when I have read an article I like to be able to refer back to it.

A year or so ago I went through all my bookmarks of what I had saved over the years, and in the oldest tranche of data, about 7-8 years old, probably 40% of the links were rotten. About half of those could be recovered by searching the article title, but I bet a solid 20% were difficult or impossible to recover.
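
A bookmark audit like this can be partly automated. The following is a minimal Python sketch, assuming a plain list of URLs exported from a browser; it counts a link as rotten when it no longer answers with a 2xx or 3xx status, which is only an approximation (a page can return 200 and still have lost its content). The function names are hypothetical.

```python
from typing import Callable, Iterable
from urllib.error import HTTPError, URLError
from urllib.request import Request, urlopen


def http_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status for `url`, or 0 if the request fails outright."""
    try:
        with urlopen(Request(url, method="HEAD"), timeout=timeout) as resp:
            return resp.status
    except HTTPError as err:  # server answered, but with a 4xx/5xx code
        return err.code
    except (URLError, ValueError, OSError):  # DNS failure, timeout, bad URL
        return 0


def rot_rate(urls: Iterable[str], status_of: Callable[[str], int] = http_status) -> float:
    """Fraction of `urls` that no longer resolve to a 2xx/3xx response."""
    urls = list(urls)
    if not urls:
        return 0.0
    dead = sum(1 for u in urls if not 200 <= status_of(u) < 400)
    return dead / len(urls)
```

With the figures quoted above, a rot rate of 0.40 with half of the dead links recoverable by title search leaves 0.40 × 0.5 = 0.20 of the whole tranche lost.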

After that experience I have moved to saving articles with citation management software so I can always go back if needed.

Likewise, YouTube in the early years was an excellent repository of TV shows that were otherwise impossible to find, but has significantly declined as an archive platform.
Title: Re: Data hoarders
Post by: ZLoth on August 17, 2023, 12:58:40 AM
This is partially why I built a TrueNAS server which is partially a backup storage server, partially a media server, and partially a file storage server. With eight 8 TB drives in a RAIDZ2 configuration, that gives me effectively 48TB of storage.
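
The 48 TB figure follows from RAIDZ2 reserving two drives' worth of space for parity. A minimal Python sketch of the arithmetic, ignoring filesystem overhead and the TB/TiB difference (so real-world usable space will be somewhat lower); the function name is hypothetical:

```python
def raidz_usable_tb(drives: int, drive_tb: float, parity: int = 2) -> float:
    """Nominal usable capacity of a single RAIDZ vdev, in TB."""
    if drives <= parity:
        raise ValueError("need more drives than parity devices")
    # Parity consumes the equivalent of `parity` whole drives.
    return (drives - parity) * drive_tb
```

Here `raidz_usable_tb(8, 8)` gives (8 − 2) × 8 = 48.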

Data hoarding? Of course, when you consider I have:
Yes, it's all backed up to several external physical hard drives. However, some critical and/or non-replaceable items (system backups, photos, personal files, KeePass (https://markholtz.info/keepass)) are backed up every night to encrypted folders on two different cloud storage providers.

Quote from: HighwayStar on August 17, 2023, 12:11:27 AM
I've become more of a data hoarder with time. The biggest contributor to this has been a realization that the mantra of the 90's/2000's when people assumed information on the internet was permanent has turned out to be incorrect. Link rot, site bankruptcies etc. have long proved otherwise. Mind you, I don't care about everything, but generally when I have read an article I like to be able to refer back to it.

A year or so ago I went through all my bookmarks of what I had saved over the years, and in the oldest tranche of data, about 7-8 years old, probably 40% of the links were rotten. About half of those could be recovered by searching the article title, but I bet a solid 20% were difficult or impossible to recover.

This is why I still collect physical media, and transfer the CDs, DVDs, and BluRays to my media server even though some folks say "you can just watch it online". For starters, there is some niche material that I have which never made it online. The anime series Planetes (2003) was released on DVD almost two decades ago (publisher no longer exists), was released in Region B on BluRay in 2021, and is not even available on Crunchyroll. Another international dark comedy film 6ixtynin9 (1999) was nominated for a foreign language Oscar, but is long out of print in DVD (bankruptcy). In addition, there have been series and movies which have been notably yanked from the streaming services by the studios for "cost reasons", especially Warner Bros-Discovery but also Disney. Lord knows how many streaming services I would need to subscribe to in order to view what I have on my own private server.

Quote from: HighwayStar on August 17, 2023, 12:11:27 AM
I have moved to saving articles with citation management software so I can always go back if needed.

Got any examples that I can look at?
Title: Re: Data hoarders
Post by: Chris on August 17, 2023, 04:06:10 AM
I've been doing highway history research in Europe, and often the most difficult timespan to find dates from is not from 50 years ago, but the 1995-2010 timeframe. It's surprising how little media is archived from that period. Archived newspapers mostly don't go beyond the 1990s, while many websites running before 2010 have been completely revamped, with lots of data no longer available.

Google isn't helping here either. It prioritizes recent news reports over anything else, making it difficult to find even something from more than 5 years ago. The best documentation may be internet forums, but Google has clearly deranked internet forums from its search results. Even discussions running for 10+ years on internet forums can only be found on the 4th or 5th page of results.

However Google still has the most results. An attempt to search on Bing or DuckDuckGo leaves even less relevant results.
Title: Re: Data hoarders
Post by: Max Rockatansky on August 17, 2023, 08:02:22 AM
If we are counting highway research as data hoarding then I've done my fair share for California.  A couple years ago I began inserting images from CHPWs, maps, historic images and other sources into the California blogs on Gribblenation.  I'm closing in on an almost complete set of representative blogs for all the state highways.  Personally I'm more proud of blogs for the more obscure times like the Stockton-Los Angeles Road, Lone Pine-Porterville High Sierra Road and Mineral King Road:

https://www.gribblenation.org/p/golden-state-highways-version-30.html?m=1
Title: Re: Data hoarders
Post by: mgk920 on August 17, 2023, 01:00:44 PM
Also the absurdly long copyright protection terms in the USA, with rights holders dumping their stuff in a memory hole if it is somehow 'un-PC' in current standards or for any other reason and then suing the daylights out of any leaks, even if the material is multi-decades old.  My sense is that much of our very culture is being lost for this reason.

Mike
Title: Re: Data hoarders
Post by: MikeTheActuary on August 17, 2023, 01:23:11 PM
I have a bad habit of downloading stuff (not specifically roadgeek) and just not deleting it, or of making redundant backups because I don't take the time to remember whether or not I actually need that particular backup.  Every picture or video on my or my wife's phone automatically gets saved...even the accidental photos, or the photos I take of computer/radio/electronics wiring to better see what's going on in typically dark, tight spaces....

That's probably why my NAS is currently complaining about running out of disk space, and why I have a weekend project of upgrading the hard drives in it. :D

Most of my data hoarding is just my not being able to motivate myself to go through and figure out what I want/need to keep, and what can be purged.  But there has been a time or two that I've wanted to refer to something random I once saw, and searching through my archives has proven pleasantly fruitful.
Title: Re: Data hoarders
Post by: 1995hoo on August 17, 2023, 01:48:04 PM
Quote from: Rothman on August 09, 2023, 10:52:50 PM
I am old enough now to start wondering where everything will go when I pass away.  I can't imagine my kids preserving my stuff for more than a generation.

I've pondered how to deal with my late father's photos from family vacations over the years (he died in 2019). He usually shot slides and he had a slide projector (that no longer works) of a style that has been discontinued, a system that used cubes instead of the more common carousel, although later on he got a second projector that used a carousel. What that means as a practical matter is that trying to view the slides on a wall or a screen in the traditional manner would be difficult. I'd like to scan them all sometime–in particular, I'd like to compare the pictures from our 1982 family trip to Nova Scotia and Newfoundland with my pictures from my 2008 trip to some of the same places. But I don't really have a good sense for how to approach the project because I would like to find some semi-automated way to do it, like if I could use a slide scanner and stack up a full cube's worth of slides and have the machine scan them all automatically. Ideally, I'd find a slide scanner that can also scan negatives so that I could scan my old negatives from my film camera days. I don't know what sort of thing exists for that sort of project and I know it would be time-consuming either way (maybe a good wintertime project), but the alternative–paying a service to do it–would be way too expensive.

Does anyone know much about slide and negative scanners? I suppose I could go to B&H in New York when I'm up there next week, but I have a feeling going there would be more worthwhile if I had some idea of what I want to look for before I go.

I'm guessing if I pursue this I might set up a separate PC to be the photo storage machine.
Title: Re: Data hoarders
Post by: J N Winkler on August 17, 2023, 02:30:52 PM
Quote from: 1995hoo on August 17, 2023, 01:48:04 PM
I've pondered how to deal with my late father's photos from family vacations over the years (he died in 2019). He usually shot slides and he had a slide projector (that no longer works) of a style that has been discontinued, a system that used cubes instead of the more common carousel, although later on he got a second projector that used a carousel. What that means as a practical matter is that trying to view the slides on a wall or a screen in the traditional manner would be difficult. I'd like to scan them all sometime–in particular, I'd like to compare the pictures from our 1982 family trip to Nova Scotia and Newfoundland with my pictures from my 2008 trip to some of the same places. But I don't really have a good sense for how to approach the project because I would like to find some semi-automated way to do it, like if I could use a slide scanner and stack up a full cube's worth of slides and have the machine scan them all automatically. Ideally, I'd find a slide scanner that can also scan negatives so that I could scan my old negatives from my film camera days. I don't know what sort of thing exists for that sort of project and I know it would be time-consuming either way (maybe a good wintertime project), but the alternative–paying a service to do it–would be way too expensive.

Does anyone know much about slide and negative scanners? I suppose I could go to B&H in New York when I'm up there next week, but I have a feeling going there would be more worthwhile if I had some idea of what I want to look for before I go.

I'm guessing if I pursue this I might set up a separate PC to be the photo storage machine.

I do have some experience with slide scanning, but it is very dated, since I lost the capability about 17 years and three computers ago.

When I was still actively doing it, I used two Benq slide scanners that could handle four slides at a time and could also scan negatives.  Both devices required installation of a SCSI card.  I got the second because the first could not remove dust specks in scans, and I soon realized that capability was essential--otherwise you waste an enormous amount of time per scan Photoshopping out dust motes.  I also discovered that the color balance in scans tilted noticeably toward the red as the scanner warmed up, so I had to put a block of frozen cheese on top in order to ensure consistent results over the course of a session.

Unless the technology has improved considerably, scanning slides is a huge time pit, since you have to manually load the slide carrier and be prepared to change out the slides once the scanner finishes a TWAIN acquisition.  I have never heard of or seen a consumer-grade scanner with the ability to scan a whole slide carousel.

I generally pre-selected a variable number of slides from each roll to scan, so the total time to process might run from just a few minutes to over an hour.  I used a battery-powered slide viewer for looking at slides and eventually burned out the bulb.  Since I could not easily find a replacement, I then had to hold slides up to the light, which is not ideal for evaluating tone and contrast.  Since my slide processor did not number slides, I ran a marker diagonally over the top edge of each row (there were two in each box) so that I could re-file scanned slides in order.

I never came close to processing even half of the approximately 400 rolls I shot.  I do actually still have the slide scanners, and even an old desktop computer that should be able to accept a SCSI card, but I don't know whether I would still be able to find compatible drivers (I was running Windows 98 at the time and the desktop runs Windows XP out of the box).  Frankly, this is not a project I want to undertake now or in the immediately foreseeable future.  When I do eventually tackle it, I want to do some in-depth research into what it would take to obtain professional-grade results (which I did not do, or even have the ability to do, 20 years ago), and possibly invest in new equipment.
Title: Re: Data hoarders
Post by: 1995hoo on August 17, 2023, 02:52:14 PM
Thanks for that info. Not encouraging to hear, but I guess that's important to know as well. Maybe I'll swing by B&H Monday night to ask about it, recognizing of course their interest is in upselling me to something more than I really need.
Title: Re: Data hoarders
Post by: J N Winkler on August 17, 2023, 03:05:49 PM
Quote from: 1995hoo on August 17, 2023, 02:52:14 PM
Thanks for that info. Not encouraging to hear, but I guess that's important to know as well. Maybe I'll swing by B&H Monday night to ask about it, recognizing of course their interest is in upselling me to something more than I really need.

It certainly wouldn't hurt to do a preliminary investigation of feasibility.  There is still considerable demand for image capture from legacy materials, and I suspect there have been major improvements in speed and capability over the past couple of decades.

The size of the collection you are working with is also a factor.  I've determined that my parents and paternal grandparents also shot slides, but I would expect that at least 80% of the rolls I might be working with down the line are mine.  For the time being, I have focused more on ensuring that everything is indexed so I can establish context later on.  Every box of my slides has a number, and for most I can pinpoint (at least approximately) the month and year I shot it.
Title: Re: Data hoarders
Post by: J N Winkler on August 17, 2023, 03:32:33 PM
In regard to the general archiving question in the OP, I would estimate that about 95% of what I have (by byte count) consists of document packages for highway construction contracts.  I also gather the same for some rail (especially high-speed) and urban transit projects, though far less extensively and systematically.

For some--not all--of the agencies for which I collect material, I extract pattern-accurate signing sheets, according to criteria that vary from one agency to another but generally call for sign layouts only if the agency in question does not routinely provide sign panel detail and sign elevation sheets.  These days I try to limit the effort I invest in this on an ongoing basis, so for some agencies I have discontinued it, though I still maintain collection in hopes eventually of using computer vision to zero in on signing.

I have about 127 GB worth of signing sheets, sorted by agency.  I have over 25,000 (about 6.25 GB total) for TxDOT alone.  Just in the last month, I crossed the threshold of 20,000 (21.1 GB) for German highway agencies as a whole.

(https://i.imgur.com/pyVC7Hf.png)

(https://i.imgur.com/91hYawr.png)
Title: Re: Data hoarders
Post by: dlsterner on August 17, 2023, 05:10:43 PM
Quote from: 1995hoo on August 17, 2023, 01:48:04 PM

I've pondered how to deal with my late father's photos from family vacations over the years (he died in 2019). He usually shot slides and he had a slide projector (that no longer works) of a style that has been discontinued, a system that used cubes instead of the more common carousel, although later on he got a second projector that used a carousel. What that means as a practical matter is that trying to view the slides on a wall or a screen in the traditional manner would be difficult. I'd like to scan them all sometime–in particular, I'd like to compare the pictures from our 1982 family trip to Nova Scotia and Newfoundland with my pictures from my 2008 trip to some of the same places. But I don't really have a good sense for how to approach the project because I would like to find some semi-automated way to do it, like if I could use a slide scanner and stack up a full cube's worth of slides and have the machine scan them all automatically. Ideally, I'd find a slide scanner that can also scan negatives so that I could scan my old negatives from my film camera days. I don't know what sort of thing exists for that sort of project and I know it would be time-consuming either way (maybe a good wintertime project), but the alternative–paying a service to do it–would be way too expensive.

Does anyone know much about slide and negative scanners? I suppose I could go to B&H in New York when I'm up there next week, but I have a feeling going there would be more worthwhile if I had some idea of what I want to look for before I go.

I'm guessing if I pursue this I might set up a separate PC to be the photo storage machine.

About 10 years ago when my mother passed I grabbed the family 35mm slide collection with the intent of digitizing them and saving them to a CD-R or DVD-R.  About 2000 slides from 1957 through 1986 (when my father passed), all meticulously noted with date and place.

I found a scanner on Amazon back then - it was $120-ish - that would scan slides (one at a time) and save them to an SD card which plugged into the scanner.  I could then connect the scanner to my Mac (certainly a PC works too) where it would behave like a USB drive and I could copy the files to my hard drive.  (I did see one that was automated but it was like $1600 - yikes)

I'm not home right now so I don't have the information about the scanner, but I'll check when I get home and edit this post (or post a reply).

Sadly a handful of slides had faded to the point where all the blue and green information was gone, just the red remained.  Maybe it was the brand of film.  Those I then converted to grayscale with a photo editing program.  Salvageable, and better than nothing.
Title: Re: Data hoarders
Post by: Dirt Roads on August 17, 2023, 05:21:08 PM
Quote from: J N Winkler on August 17, 2023, 03:32:33 PM
In regard to the general archiving question in the OP, I would estimate that about 95% of what I have (by byte count) consists of document packages for highway construction contracts.  I also gather the same for some rail (especially high-speed) and urban transit projects, though far less extensively and systematically.

For some--not all--of the agencies for which I collect material, I extract pattern-accurate signing sheets, according to criteria that vary from one agency to another but generally call for sign layouts only if the agency in question does not routinely provide sign panel detail and sign elevation sheets.  These days I try to limit the effort I invest in this on an ongoing basis, so for some agencies I have discontinued it, though I still maintain collection in hopes eventually of using computer vision to zero in on signing.

I have about 127 GB worth of signing sheets, sorted by agency.  I have over 25,000 (about 6.25 GB total) for TxDOT alone.  Just in the last month, I crossed the threshold of 20,000 (21.1 GB) for German highway agencies as a whole.

I've been downloading rail transit planning information since the early days of the Internet.  Some of it was originally stored on 5-1/4 inch floppies and converted to 3-1/2 hard floppies.  I've also got a ton of really old stuff on Zip drives.  Over the years, I've filled up several 4 GB hard drives full of data.  But recently I've been on a Road Diet (deleting the old Roadgeek stuff) and will formally begin a Rail Diet next month (after my 10-year safety-and-security protection clause expires).  What is odd is that I've never really needed to look up any old project information, but I'm so forgetful that I often can't understand how I remembered some of the information when asked.  (Which means, I did go back and check several times).  I've got so much stuff that I usually can't find what I'm looking for (cueing up the U2 song).

For those wondering if they should become a Data Hoarder...

The firm that I worked for was increasingly pressed to produce cost estimates for almost everything where people get moved (trains, buses, vans, taxis, moving walkways, elevators, etc.) and we didn't have any semblance of a cost database.  As I started to collect data for a complex project, I realized that we had access to many resources that just needed cataloguing.  If there was an article with a dollar $ sign (or pound £ sign, if you are so inclined), it got collected and added to a lengthy document.  Then I started working backwards in time.  We got really good at cost estimating without having any Professional Cost Estimators on staff (after all, the folks hiring us almost always had them).  But it was an effort you could never stop.  Eventually, I got where the demands of safety certification would not allow me to continue being a cost wizard.  But this Data Hoarding technique would be quite useful in almost any field of business.
Title: Re: Data hoarders
Post by: HighwayStar on August 18, 2023, 11:14:08 AM
Quote from: ZLoth on August 17, 2023, 12:58:40 AM

Quote from: HighwayStar on August 17, 2023, 12:11:27 AM
I have moved to saving articles with citation management software so I can always go back if needed.

Got any examples that I can look at?

I originally used Mendeley and Zotero. I've abandoned Mendeley however because Elsevier has endeavored to make it hard to transfer the library to other software. Zotero is much better in that respect and has become my sole solution. The main downside has been backing up and synchronizing the library across devices. It's a bit tricky to get it playing nicely with cloud storage.
Title: Re: Data hoarders
Post by: HighwayStar on August 18, 2023, 11:22:08 AM
Quote from: 1995hoo on August 17, 2023, 01:48:04 PM
Quote from: Rothman on August 09, 2023, 10:52:50 PM
I am old enough now to start wondering where everything will go when I pass away.  I can't imagine my kids preserving my stuff for more than a generation.

I've pondered how to deal with my late father's photos from family vacations over the years (he died in 2019). He usually shot slides and he had a slide projector (that no longer works) of a style that has been discontinued, a system that used cubes instead of the more common carousel, although later on he got a second projector that used a carousel. What that means as a practical matter is that trying to view the slides on a wall or a screen in the traditional manner would be difficult. I'd like to scan them all sometime–in particular, I'd like to compare the pictures from our 1982 family trip to Nova Scotia and Newfoundland with my pictures from my 2008 trip to some of the same places. But I don't really have a good sense for how to approach the project because I would like to find some semi-automated way to do it, like if I could use a slide scanner and stack up a full cube's worth of slides and have the machine scan them all automatically. Ideally, I'd find a slide scanner that can also scan negatives so that I could scan my old negatives from my film camera days. I don't know what sort of thing exists for that sort of project and I know it would be time-consuming either way (maybe a good wintertime project), but the alternative–paying a service to do it–would be way too expensive.

Does anyone know much about slide and negative scanners? I suppose I could go to B&H in New York when I'm up there next week, but I have a feeling going there would be more worthwhile if I had some idea of what I want to look for before I go.

I'm guessing if I pursue this I might set up a separate PC to be the photo storage machine.

I've done them, using a prosumer flatbed scanner. You can get a nice one with slide scanning capabilities used nowadays for not a lot of money. Go high end. For example, I bought a CanoScan 9000F for $60 on OfferUp earlier this year. That was a ~$900 scanner when new and will do an excellent job on slides.

As others have said, it is a bit time consuming to do right, but the truth is ALL analog to digital conversion is time consuming to do right.

By far the most important lesson learned over the years is to spend lots of time up front figuring out the workflow. For slides this is something along the lines of scanning at high resolution and bit depth in a lossless format. Also organize the slides by date and sequence (both of which should be on the frames in most cases) and scan in that order and name accordingly to make organization easy.

Before I understood this, I did conversion projects where I scanned at low resolution and in JPEG. I had to redo those. Time spent getting a good workflow that produces results you can live forever with is important.
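To make the naming part of that workflow concrete, here is a rough Python sketch. The filename pattern, the `scan_filename` and `is_lossless` helpers, and the sample dates are all made up for illustration; the only idea taken from above is that names should sort by date and then by sequence, and that scans should land in a lossless format (TIFF or PNG is assumed here).

```python
from datetime import date
from pathlib import Path


def scan_filename(shot: date, roll: int, frame: int, ext: str = "tif") -> str:
    """Build a name that sorts by date, then roll (or cube), then frame."""
    return f"{shot:%Y-%m-%d}_r{roll:02d}_f{frame:03d}.{ext}"


def is_lossless(path: Path) -> bool:
    """True for the lossless containers assumed in this sketch."""
    return path.suffix.lower() in {".tif", ".tiff", ".png"}


# Hypothetical slides, deliberately listed out of order.
names = [
    scan_filename(date(1982, 7, 14), roll=2, frame=11),
    scan_filename(date(1982, 7, 3), roll=1, frame=5),
    scan_filename(date(1982, 7, 14), roll=2, frame=2),
]

# A plain alphabetical sort is also a chronological sort.
for name in sorted(names):
    print(name, is_lossless(Path(name)))
```

Zero-padding the roll and frame numbers is what keeps frame 2 ahead of frame 11 in an ordinary file listing.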

It will take some time to sit and load the scanner that many times, but if you set it up next to you at the desk it's possible to do a fair bit of multitasking and get some of the time back for other things.
Title: Re: Data hoarders
Post by: HighwayStar on August 18, 2023, 11:35:17 AM
Quote from: J N Winkler on August 17, 2023, 03:05:49 PM
It certainly wouldn't hurt to do a preliminary investigation of feasibility.  There is still considerable demand for image capture from legacy materials, and I suspect there have been major improvements in speed and capability over the past couple of decades.

Unfortunately that has generally not been the case for many legacy formats. The best equipment for capture has in many cases actually been discontinued and is only available on the used market at greatly inflated prices. Slides are fortunately an exception to this trend, in that good quality scanners are still being made, but for many other formats that is not the case. VHS for example is a nightmare to capture at this point and was easier to do 15 years ago.

That said, the best way to scan slides is likely going to be a flatbed scanner, loaded a few at a time. Any contraption that someone has come up with to take the whole carousel I would be quite wary of.
Title: Re: Data hoarders
Post by: J N Winkler on August 18, 2023, 02:26:51 PM
Quote from: Dirt Roads on August 17, 2023, 05:21:08 PMI've been downloading rail transit planning information since the early days of the Internet.  Some of it was originally stored on 5-1/4 inch floppies and converted to 3-1/2 hard floppies.  I've also got a ton of really old stuff on Zip drives.  Over the years, I've filled up several 4gB hard drives full of data.  But recently I've been on a Road Diet (deleting the old Roadgeek stuff) and will formally begin a Rail Diet next month (after my 10-year safety-and-security protection clause expires).  What is odd is that I've never really needed to look up any old project information, but I'm so forgetful that I often can't understand how I remembered some of the information when asked.  (Which means, I did go back and check several times).  I've got so much stuff that I usually can't find what I'm looking for (cueing up the U2 song).

Did you explore the option of simply copying this material over to a mass storage device?  At nominal capacities of 360 KB (5 1/4" floppy), 1.44 MB (3 1/2" floppy), 100 MB (Zip drives), and 4 GB for a handful of mass storage devices, this sounds like a collection that could easily fit on a lightweight 2 TB drive costing well under $100.

Quote from: Dirt Roads on August 17, 2023, 05:21:08 PMFor those wondering if they should become a Data Hoarder...

A few lessons I've learned (some high-level, some low-level):

*  Keep things organized

*  Automate organization as much as possible

*  Don't neglect metadata generation

*  Automate metadata generation as much as possible

*  When automating acquisition, use regex-capable tools (even if this means using ports of Unix shell commands in a Windows environment)

*  Work in Unicode (with en-dashes appearing in filenames, pure ASCII is not safe even in 100% US-based contexts)

*  Log source URLs, server addresses, etc. (useful for backtracing documents using the Web Archive)

*  If you use scripts that need revision from time to time, save the old versions
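As a rough illustration of several of these points at once (automated metadata, regex use, Unicode-safe handling, and logging source URLs), here is a minimal Python sketch. The manifest layout, the `describe` and `append_to_manifest` helpers, the year-in-filename pattern, and the example URL are assumptions for the sake of the example, not a description of anyone's actual setup.

```python
import hashlib
import json
import re
import tempfile
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical naming convention: a four-digit year somewhere in the filename.
YEAR_RE = re.compile(r"(?<!\d)(?:19|20)\d{2}(?!\d)")


def describe(path: Path, source_url: str) -> dict:
    """Collect basic metadata for one acquired file."""
    data = path.read_bytes()
    match = YEAR_RE.search(path.name)
    return {
        "file": path.name,
        "source_url": source_url,
        "sha256": hashlib.sha256(data).hexdigest(),
        "bytes": len(data),
        "year_in_name": int(match.group(0)) if match else None,
        "logged_at": datetime.now(timezone.utc).isoformat(),
    }


def append_to_manifest(manifest: Path, record: dict) -> None:
    """Append one JSON line; UTF-8 keeps en-dashes in filenames intact."""
    with manifest.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, ensure_ascii=False) + "\n")


# Demo in a throwaway directory, using a filename that contains an en-dash.
with tempfile.TemporaryDirectory() as tmp:
    doc = Path(tmp) / "route-log_2013–2014.pdf"
    doc.write_bytes(b"placeholder contents")
    manifest = Path(tmp) / "manifest.jsonl"
    append_to_manifest(manifest, describe(doc, "https://example.org/route-log.pdf"))
    print(manifest.read_text(encoding="utf-8"))
```

An append-only log with one JSON record per line is easy to search later when backtracing a document, and the stored hash makes it possible to tell whether a file has changed since it was acquired.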

Quote from: HighwayStar on August 18, 2023, 11:22:08 AMBy far the most important lesson learned over the  years is to spend lots of time up front figuring out the workflow. For slides this is something along the lines of scanning at high resolution and bit depth in a lossless format. Also organize the slides by date and sequence (both of which should be on the frames in most cases) and scan in that order and name accordingly to make organization easy.

IMV, the part in boldface cannot be stressed enough.
Title: Re: Data hoarders
Post by: Dirt Roads on August 20, 2023, 03:22:46 PM
Quote from: Dirt Roads on August 17, 2023, 05:21:08 PMI've been downloading rail transit planning information since the early days of the Internet.  Some of it was originally stored on 5-1/4 inch floppies and converted to 3-1/2 hard floppies.  I've also got a ton of really old stuff on Zip drives.  Over the years, I've filled up several 4gB hard drives full of data.  But recently I've been on a Road Diet (deleting the old Roadgeek stuff) and will formally begin a Rail Diet next month (after my 10-year safety-and-security protection clause expires).  What is odd is that I've never really needed to look up any old project information, but I'm so forgetful that I often can't understand how I remembered some of the information when asked.  (Which means, I did go back and check several times).  I've got so much stuff that I usually can't find what I'm looking for (cueing up the U2 song).

Quote from: J N Winkler on August 18, 2023, 02:26:51 PM
Did you explore the option of simply copying this material over to a mass storage device?  At nominal capacities of 360 KB (5 1/4" floppy), 1.44 MB (3 1/2" floppy), 100 MB (Zip drives), and 4 GB for a handful of mass storage devices, this sounds like a collection that could easily fit on a lightweight 2 TB drive costing well under $100.

The rationale for keeping the floppies was that I also maintained the original x386, x486 and early Pentium computers that could still run the older versions of software.  One of the problems was that we worked on too many software platforms that didn't port over from one generation to another.  Many of those software platforms didn't survive past two generations of hardware (about 6 years) and two generations of Windows (about 5 years), overlapping (on average about 7-1/2 years).  Ouch, even MS-Fortran and the original Visual Basic didn't migrate with the newer stuff.  Plus, it was a real pain to reinstall VB 1.0, upgrade to VB 1.1, VB 1.3, VB 2.0 and VB 3.0 just to find out that the newer Windows add-ons/add-ins would not allow the old programs to operate.  I could have bought a separate new computer and downgraded to the older versions of Windows, but it was much easier to add extra cooling power to the old computers.

Quote from: J N Winkler on August 18, 2023, 02:26:51 PM
Did you explore the option of simply copying this material over to a mass storage device?

Some of the really old stuff has been ported over to the disk drives.  For the record, I maintain an online computer (with normal data) and an offline computer (with private data and proprietary data).  I have two 4GB backup drives, two 2GB backup drives, three 1GB backup drives (one bad), two old offline computers with 1GB drives and two old offline computers with 720MB drives.  At one time, almost all of those drives were full (except for the old Windows Vista machine that crashed and I had to replace its hard drive, and even that one got half-full at one point).  Plus, I still have two really old 350MB plug-in-wall drives (in the days before they could work on USB power).
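For a sense of scale, the drives listed above can be totaled with a few lines of Python. This is only back-of-the-envelope arithmetic: it assumes decimal units (1 GB = 1000 MB), treats every drive as completely full, and compares against the 2 TB drive suggested earlier in the thread.

```python
# (count, capacity in MB) for each kind of drive listed above.
# Decimal units (1 GB = 1000 MB) are an assumption of this sketch.
inventory = {
    "4 GB backup drives": (2, 4000),
    "2 GB backup drives": (2, 2000),
    "1 GB backup drives": (3, 1000),
    "offline computers, 1 GB": (2, 1000),
    "offline computers, 720 MB": (2, 720),
    "plug-in-wall drives, 350 MB": (2, 350),
}

total_mb = sum(count * size for count, size in inventory.values())
target_mb = 2_000_000  # a 2 TB drive, in MB

print(f"Total if every drive were full: {total_mb / 1000:.2f} GB")
print(f"Share of a 2 TB drive: {total_mb / target_mb:.2%}")
```

Even in the worst case the whole collection comes to roughly 19 GB, which is under 1% of a 2 TB drive.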

After the most recent cleanup, I can probably store everything on a 2GB drive.  I do try to alternate the drives when I perform frequent manual backups and annual archivals.  Which leads to your next comment...

Quote from: J N Winkler on August 18, 2023, 02:26:51 PM
A few lessons I've learned (some high-level, some low-level):

*  Keep things organized

*  Automate organization as much as possible

*  Don't neglect metadata generation

*  Automate metadata generation as much as possible

I have a pretty good Windows File architecture for everything.  But sadly, with many years of 100+ hour workweeks, I never had any time to keep the backup storage organized.  Much of the Data Hoarding efforts involved backups-over-backups, backups-of-backups (ouch) and even sometimes archives-of-archives (double ouch).  If you couldn't already tell, I always tried to keep two independent backups such that in the case of another crash of a backup drive I don't lose too much stuff.  After the next cleanup, I shouldn't need to keep two sets of backups.
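For untangling backups-of-backups, one approach that might help is grouping files by content hash so that identical copies show up together regardless of name or folder. The following Python sketch is only an illustration under that assumption; `file_digest` and `find_duplicates` are hypothetical helper names, and it only reads files locally without deleting or sending anything anywhere.

```python
import hashlib
import tempfile
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of a file, read in chunks so large files don't fill memory."""
    h = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root: Path) -> list[list[Path]]:
    """Return groups of files under root whose contents are identical."""
    groups = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups[file_digest(path)].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]


# Demo on a throwaway directory tree with one duplicated file.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "backup1").mkdir()
    (root / "backup2").mkdir()
    (root / "backup1" / "plan.txt").write_bytes(b"same contents")
    (root / "backup2" / "plan_copy.txt").write_bytes(b"same contents")
    (root / "backup2" / "notes.txt").write_bytes(b"something else")
    for group in find_duplicates(root):
        print([p.name for p in group])
```

The sketch only reports duplicates; deciding which copy to keep is left as a manual step.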

I've never trusted the automated backup software and metadata software with personal data and proprietary data (too much alien "E.T., phone home" going on inside there).  If it weren't for my wife and I both generating photo albums that need to be consolidated occasionally, I would purchase smaller drives and automate the backup process on both machines and quit worrying about Google collecting a copy of everything.  I still find things that never made it into my archives system.

There's no room here for all of this stuff.  But what is scary about all of this is how much money has been poured into computers, software and backup drives over the course of my lifetime.  Given that I'm a big cheap-O and have probably spent less than a tenth of what everybody else who had personal computers in the 1970s has, I can't even imagine what they have spent.
Title: Re: Data hoarders
Post by: Scott5114 on August 21, 2023, 12:58:11 AM
I imagine there are still audio/video or film processing companies in most cities that will scan slides and return both the slides and digital image files to you for a fee. My parents used a local company in Oklahoma City to transfer VHS tapes to DVD back in the day. I would be surprised if there wasn't someone offering the same service for slides, especially in the DMV area. Wouldn't hurt to check, at least...the fee would probably be less than the cost of getting your own equipment, never mind the time investment.
Title: Re: Data hoarders
Post by: formulanone on August 21, 2023, 06:20:03 AM
Quote from: Scott5114 on August 21, 2023, 12:58:11 AM
I imagine there are still audio/video or film processing companies in most cities that will scan slides and return both the slides and digital image files to you for a fee. My parents used a local company in Oklahoma City to transfer VHS tapes to DVD back in the day. I would be surprised if there wasn't someone offering the same service for slides, especially in the DMV area. Wouldn't hurt to check, at least...the fee would probably be less than the cost of getting your own equipment, never mind the time investment.

Jim Grey has a pretty good write-up on a slide scanner from 2014. (https://blog.jimgrey.net/2014/08/15/wolverine-super-f2d/) He's a part-time member on these boards.

Converting slides yourself is a slow process; you can buy a slide reader which is effectively a lower-end digital camera with a light source and cover. If you're just sharing them for fun on social media without worrying about the archival image depth and preservation, then go for it. The problem with the cheaper ones seems to be vignetting in the corners and less-than-accurate color reproduction, since the light emitted from the slide reader has its own color cast. You also have to carefully fit each one into the reader and will probably want to adjust lighting, colors, and perhaps perform "dust and scratches" clean-up work.

There's higher-end equipment and also professionals who will do the work for you. Like pretty much anything in the sphere of photography, it just depends on how much you want to spend, and how much you value the noticeable differences. Though, someone who is familiar with the slide images is going to know and remember the colors, and treat them with a little more care, than someone almost entirely emotionally unattached from the process. Slide film was the closest thing to a compact high-definition image, and some of the results from 70+-year-old slides are still pretty amazing compared to what most prosumer cameras of the last decade can yield.

I'm surely going to inherit hundreds of old family slides from the 1960s-80s one day, but have no idea how I'm going to do it without breaking the bank. I still have a few thousand photos to about 2002 or so that I have yet to feed into a scanner. I want quality, but I also just wonder if anyone's going to even care a few years after I'm gone...on the other hand, maybe the quality will be something to treasure in the future. (Sorry to get sappy there.)
Title: Re: Data hoarders
Post by: 1995hoo on August 21, 2023, 07:05:30 AM
Quote from: Scott5114 on August 21, 2023, 12:58:11 AM
I imagine there are still audio/video or film processing companies in most cities that will scan slides and return both the slides and digital image files to you for a fee. My parents used a local company in Oklahoma City to transfer VHS tapes to DVD back in the day. I would be surprised if there wasn't someone offering the same service for slides, especially in the DMV area. Wouldn't hurt to check, at least...the fee would probably be less than the cost of getting your own equipment, never mind the time investment.

I did look into that and from what I found, I'd be looking at at least several thousand. Hence why I was exploring DIY options! I will concede that the point formulanone makes about whether anyone will care in the future has crossed my mind. My wife and I don't have any kids and my brother is not married (as far as I know, he doesn't have any kids he knows about, at least...) BUT as it is now, the slides all sit in cabinets in my mother's living room.

I think the point HighwayStar makes about figuring out the best way to run a slide or negative scanner while doing other things is exceptionally important!
Title: Re: Data hoarders
Post by: HighwayStar on August 21, 2023, 04:31:38 PM
Quote from: 1995hoo on August 21, 2023, 07:05:30 AM
Quote from: Scott5114 on August 21, 2023, 12:58:11 AM
I imagine there are still audio/video or film processing companies in most cities that will scan slides and return both the slides and digital image files to you for a fee. My parents used a local company in Oklahoma City to transfer VHS tapes to DVD back in the day. I would be surprised if there wasn't someone offering the same service for slides, especially in the DMV area. Wouldn't hurt to check, at least...the fee would probably be less than the cost of getting your own equipment, never mind the time investment.

I did look into that and from what I found, I'd be looking at at least several thousand. Hence why I was exploring DIY options! I will concede that the point formulanone makes about whether anyone will care in the future has crossed my mind. My wife and I don't have any kids and my brother is not married (as far as I know, he doesn't have any kids he knows about, at least...) BUT as it is now, the slides all sit in cabinets in my mother's living room.

I think the point HighwayStar makes about figuring out the best way to run a slide or negative scanner while doing other things is exceptionally important!

Unfortunately the paid conversion services are mostly crap and best avoided.
The problem is that doing good-quality conversions of any old analog media to digital takes specialized knowledge, attention to detail, and a decent amount of time, along with expensive and sometimes rare equipment. The actual cost to do it correctly will almost always be well above what most people would pay, so the only way to make a business out of it is to hire minimum wage labor that has no clue what they are doing and churn out poor quality conversions using substandard equipment. They get away with it because people invariably attribute poor quality to the "old tape/film/slides/etc." not knowing that they should look far better if done right.
There might be a local service here or there that can do it, but don't even consider giving them your business unless they can give plenty of details about the process, what they do or do not do, equipment used, etc. and pass all those tests. The only places I ever tried that with were evasive, vague, or simply told me that they would not discuss it.
Here is an excellent exposé of Legacybox done by VWestlife, which should dissuade anyone from ever being tempted by that service. I suspect many others would be a similar story.


Doing it yourself will not only save money but likely result in vastly superior quality.
Title: Re: Data hoarders
Post by: bugo on August 22, 2023, 09:59:14 AM
Here's a small part of my map hoard. This Google Drive folder contains several hundred maps, most in convenient PDF format, along with a few TIF and even some JPG files. It includes state map archives for several states, Arkansas and Oklahoma control section maps (including several that are no longer available on the ArDOT or ODOT websites) and, in the folder Atlases, some road atlases going as far back as 1926.

https://drive.google.com/drive/folders/1DXfHOykUwntSdiO56f4tto4ivXpAh_ep?usp=drive_link