Sort GBs of data - archive

jalisco
Posts: 94
Joined: 2013-09-01 17:30

Sort GBs of data - archive

#1 Post by jalisco »

Hi,

I have tons of images, having saved many "quick and dirty backups" over the years.

In other words, I have often tried to keep things organized, but over the past decade, whenever a drive filled up I just saved the old hard drive and replaced it with a new one.
As a result, I have many "Photos" folders in multiple directories, from different systems, predominantly Apple and Debian (Shotwell).

What I would like to do is consolidate all the images, erring on the side of caution -- I don't want to lose any images, if possible.

For example, I have say
Directory One -> Photos, 75 GB
Directory Two -> Photos, 50 GB
Directory Three -> Photos, 20 GB

so on and so forth.

Programs like iPhoto/Photos use databases and have all kinds of extra crap (like "faces" and thumbnails) that I am not necessarily interested in.

At this point, I just want to organize all my images, without having to manually go through 100,000+ images, many of them redundant.

I use Debian predominantly, and it has been my main system for some time. There may be 2-3 instances of iPhoto images, based on my partner's usage of a Mac, plus really old images from when I used Apple products.

My plan is:

Well, now that I start to formulate it, I realize, I have no plan =/

I know I can use rsync to synchronize the directories.

But, ideally, I really want all the files in one big fat directory. I don't really care about the funny "by year, or some other funny camera directory/sub-directory style". I just want all the image files.

So, then, I guess I should do some "bashing": use Bash to recursively gather all the images from the various subdirectories and move them into one big folder.

Neither of these possibilities leaves me completely comfortable, because I am not sure how they handle duplicates.

Thinking out loud, I guess I would want to:

1. create a Bash script using the mv command with the --backup option, to recursively collect all the image files from their various folder structures and subdirectories into one massive directory.
2. then use fdupes to remove the duplicates.
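Steps 1 and 2 above can be sketched roughly like this (paths and extensions are hypothetical examples; the fdupes step is left commented out so nothing is deleted until the move has been verified):

```shell
# Step 1: flatten every image into one directory. GNU mv's
# --backup=numbered keeps name clashes as pic.jpg, pic.jpg.~1~, ...
# so no file is ever silently overwritten.
DEST=/tmp/demo-flat/all
mkdir -p "$DEST" /tmp/demo-flat/src/a /tmp/demo-flat/src/b
echo one > /tmp/demo-flat/src/a/pic.jpg
echo two > /tmp/demo-flat/src/b/pic.jpg   # same name, different content

find /tmp/demo-flat/src -type f \
    \( -iname '*.jpg' -o -iname '*.jpeg' -o -iname '*.png' \) \
    -exec mv --backup=numbered -t "$DEST" '{}' +

# Step 2 (only after checking the result): delete exact duplicates.
# fdupes -rdN "$DEST"   # -r recurse, -d delete, -N keep first copy, no prompts
```

After this runs, the clashing pic.jpg files both survive, one of them renamed to pic.jpg.~1~; fdupes would then remove only byte-identical copies.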


Does anyone know of a heavy-duty application that can take all the images in and manage them, to make this process easier?

Programs like Photos, or even Shotwell, simply can't handle this volume of images very efficiently.

acewiza
Posts: 357
Joined: 2013-05-28 12:38
Location: Out West

Re: Sort GBs of data - archive

#2 Post by acewiza »

"Organizing" is always a challenge. Here's a little snippet I use to find dupes:

Code:

    find . -type f -exec md5sum '{}' ';' | sort | uniq --all-repeated=separate -w 15 > dupes.txt
Other than that, I think you are pretty much on your own.
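One caveat worth noting: `-w 15` compares only the first 15 hex digits of each md5, so in principle two different files could be grouped together. Comparing the full 32 digits is safer, and `-exec ... +` batches files into fewer md5sum invocations (demo files and paths below are made up for illustration):

```shell
# Demo setup: two identical files and one different one.
mkdir -p /tmp/demo-dupes
cd /tmp/demo-dupes
echo same > a.jpg
echo same > b.jpg
echo different > c.jpg

# Hash everything, sort by hash, and print only groups of repeated
# hashes; -w 32 compares the full md5, not just a prefix.
find . -type f -name '*.jpg' -exec md5sum '{}' + \
    | sort | uniq --all-repeated=separate -w 32 > dupes.txt
```

dupes.txt then contains only the duplicate group (a.jpg and b.jpg), separated by blank lines per group; unique files like c.jpg never appear.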
Nobody would ever ask questions If everyone possessed encyclopedic knowledge of the man pages.

jalisco
Posts: 94
Joined: 2013-09-01 17:30

Re: Sort GBs of data - archive

#3 Post by jalisco »

acewiza wrote:"Organizing" is always a challenge. Here's a little snippet I use to find dupes:

Code:

    find . -type f -exec md5sum '{}' ';' | sort | uniq --all-repeated=separate -w 15 > dupes.txt
Other than that, I think you are pretty much on your own.

Thanks. That's kind of what I figured. Unfortunately, at some point I passed from a "consumer-level" problem into a "professional-level" problem =)

bw123
Posts: 4015
Joined: 2011-05-09 06:02
Has thanked: 1 time
Been thanked: 28 times

Re: Sort GBs of data - archive

#4 Post by bw123 »

The "one massive directory" idea doesn't sound fun to me; unless the file names are all consistent, that would be a mess, wouldn't it?

I don't have nearly that big a problem, but I like arranging things by year, and once the dupes are gone, I use a desktop search app to find things.
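That by-year layout can be scripted from file timestamps alone. EXIF dates would be more accurate (e.g. exiftool's `'-Directory<DateTimeOriginal' -d %Y`, if installed), but a rough sketch using only coreutils and mtime, with hypothetical demo paths, looks like this:

```shell
# Sort files into YYYY/ subdirectories by modification time.
SRC=/tmp/demo-year/src
DST=/tmp/demo-year/by-year
mkdir -p "$SRC" "$DST"
echo x > "$SRC/old.jpg"
touch -d '2015-06-01' "$SRC/old.jpg"   # pretend this photo is from 2015

find "$SRC" -type f | while read -r f; do
    y=$(date -r "$f" +%Y)              # year from the file's mtime
    mkdir -p "$DST/$y"
    mv --backup=numbered "$f" "$DST/$y/"   # name clashes kept as .~1~ etc.
done
```

The caveat is that mtime is only trustworthy if the copies preserved timestamps; files touched by backups or copies without `-p` will land in the wrong year, which is why EXIF-based sorting is preferable when the metadata exists.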
resigned by AI ChatGPT
