I guess they must have all been duplicate emails or something. It's definitely going to be interesting to see how people react to this, though.
Weren't the FBI initially investigating somewhere in the region of 35k e-mails, whereas the thing that sparked off the recent re-opening was finding 650k e-mails on Weiner's laptop? So at the very most, something like 5% of them could be duplicates. And I believe the FBI specifically mentioned they'd found articles to/from HRC that hadn't been handed over in the original set (despite the subpoena).
Just to read 650k e-mails in the time between 're-opening' the investigation and now, they'd have had to get through about 50 a minute, non-stop, for the last nine days. I have my doubts about how closely they have scrutinised these new e-mails.
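For anyone who wants to check the arithmetic, here's a quick sketch in Python. The inputs are just the figures quoted above (650k, 35k, nine days), which are the numbers being discussed rather than confirmed counts.

```python
# Back-of-the-envelope check of the figures quoted in this post.
TOTAL_EMAILS = 650_000   # e-mails reportedly on the laptop (figure from the post)
ORIGINAL_SET = 35_000    # size of the original set (figure from the post)
DAYS = 9                 # time since the investigation was 're-opened'

# Reading rate needed to get through everything, working around the clock.
minutes_available = DAYS * 24 * 60
rate_per_minute = TOTAL_EMAILS / minutes_available

# Largest share of the new e-mails that could be duplicates of the original set.
max_duplicate_share = ORIGINAL_SET / TOTAL_EMAILS

print(f"{rate_per_minute:.1f} e-mails per minute")
print(f"{max_duplicate_share:.1%} could be duplicates at most")
```

That works out to roughly 50 a minute and a bit over 5%, which matches the numbers above.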
Also on the e-mail note, one trick Podesta and the others used was to reply over the top of old e-mail chains and change the subject to something innocuous like 'Congratulations!', 'Happy birthday!', etc. These replies would often come several months after the last e-mail in the chain. So you can have a chain of ten e-mails that originally had the subject 'Make sure no one finds out about this', but as soon as you get word that your e-mails are going to be looked at, you reply, change the subject to 'Congratulations' and add some twee message about one of the grandkids. To anyone skimming through, that's an unimportant e-mail chain, and it gets ignored.
1. First, Clinton's email server had a heck of a lot more than 35k emails. Probably in the millions, since running the State Department is a big deal. 35k is what you get from being a university student who ordered Domino's Pizza online using their email account 10 years prior (I get one email from them every 3-4 days, and similar amounts from other marketing attached to ordering stuff online; that's about 1,000 emails per decade per source; and so I have 3,000 unread emails in my half-decade-old student email address that I basically never use). 35,000 was the approximate number not handed over, due to being deemed personal in nature by a third-party law firm which sorted through them.
2. Emails are not magical. They do not fly through the ether unassisted on magic carpets of naivety and technological dreams. If an email arrived on this or any other person's computer, it existed on the sender's email server and on the recipient's email server (which could be the same one, or it could be gmail, etc).
This means that unless it was one of the emails deemed personal in nature by the law firm, it was either on the server or THERE WAS A GRAND CONSPIRACY TO DELETE IT. These are the only three options. Either it existed on the server and was a duplicate, or it did not and was deleted either by the law firm as an email of a personal nature or by a GRAND CONSPIRACY. It is very important to note this fact because it means that all non-duplicate emails are either giant thermonuclear stink-bombs or they are personal emails. There will be absolutely zero non-duplicate emails about relevant day-to-day running of things, and any such emails should and would send up red flags.
Likewise, there should be zero modifications to email content, because again, the only reason to find this would be:
A. something which was private, and explained as such (and, again, should be obviously private if a human read the un-modified version). I'm not sure this was actually possible or if the law firm got rid of personal emails on an all-or-nothing basis.
B. something which is a giant thermonuclear stink-bomb involving a GRAND CONSPIRACY.
3. People are not looking at the emails. People are awful at noticing things and do not have a photographic memory of a million emails to ensure they are the same. The emails are primarily being processed by software. Upon acquisition of the emails, this is how they are processed, more or less (or rather, how any sensible person would process them):
A. Emails are imported into a format compatible with FBI software for such cases, as the original set were.
B. A binary diff tool is run to toss out any exact duplicates -- this would throw out all emails which are exact matches of those already processed, checking subject, content, and metadata such as to/from and timestamps. This rules out all emails which are not either personal emails or the result of a GRAND CONSPIRACY.
C. If any emails are left, these are divided amongst investigators to look for any signs of GRAND CONSPIRACY, which includes emails that look like they are related to running things/government duties, and emails that do not look like they are of a personal nature.
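To make step B concrete, here's a minimal sketch of exact-duplicate filtering. It's my own illustration, not anything the FBI has described: the `Email` record, its field names and the sample data are all made up, and it compares SHA-256 fingerprints instead of running a literal binary diff, which gets you the same "exact match or not" answer.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Email:
    # Hypothetical record layout; real forensic tooling will differ.
    sender: str
    recipients: tuple
    timestamp: str
    subject: str
    body: str


def fingerprint(email: Email) -> str:
    """Hash every field, so any change to subject, body, or metadata alters the result."""
    digest = hashlib.sha256()
    fields = (email.sender, ",".join(email.recipients), email.timestamp,
              email.subject, email.body)
    for field in fields:
        data = field.encode("utf-8")
        # Length prefix keeps field boundaries unambiguous.
        digest.update(len(data).to_bytes(8, "big"))
        digest.update(data)
    return digest.hexdigest()


def drop_exact_duplicates(already_processed, incoming):
    """Step B: keep only incoming emails with no exact match in the processed set."""
    known = {fingerprint(e) for e in already_processed}
    return [e for e in incoming if fingerprint(e) not in known]


# Made-up sample data, for illustration only.
processed = [
    Email("a@example.com", ("b@example.com",), "2012-01-05T10:00:00Z", "Lunch", "Noon?"),
]
laptop = [
    Email("a@example.com", ("b@example.com",), "2012-01-05T10:00:00Z", "Lunch", "Noon?"),
    Email("a@example.com", ("b@example.com",), "2012-01-06T09:30:00Z", "Photos", "Attached."),
]
leftover = drop_exact_duplicates(processed, laptop)
print(len(leftover))  # only the "Photos" email survives to step C
```

Building the set of known fingerprints once means each incoming email is a single lookup, so a few hundred thousand emails is a small job for a machine.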
Because step B rules out all but non-duplicated, personal emails, there should be a very small number of them, essentially consisting solely of personal matters discussed between those using a Clinton address and this one person. The absolute upper bound on the number is 35,000, if you assume that every personal email in the entire organization was a communication with this one person. In reality, it's a tiny fraction of that, because there's only so much banter one person is capable of having with the entire State Department.
So by step C, there should only be a few hundred emails at most. Of these, software run during step B makes it blatantly obvious if there are modified emails, and by process of elimination, any new information should be strictly personal in nature. Hand them out to a decent sized room to sift through, and you could have the whole thing wrapped up in a day or two of reading and cross referencing.
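Here's a rough sketch of how software could flag modified emails, as opposed to merely new ones. I'm assuming each email can be keyed on a stable identifier such as its Message-ID header; whether the actual tooling works that way is a guess on my part, and the function name and sample data are invented for the example.

```python
def classify(original, recovered):
    """Sort recovered emails into duplicate / modified / new.

    Both arguments map a message ID to a (subject, body) tuple.
    """
    result = {"duplicate": [], "modified": [], "new": []}
    for message_id, content in recovered.items():
        if message_id not in original:
            result["new"].append(message_id)        # never seen before
        elif original[message_id] == content:
            result["duplicate"].append(message_id)  # identical copy
        else:
            result["modified"].append(message_id)   # same ID, different content
    return result


# Made-up sample data, for illustration only.
original_set = {
    "<1@example.com>": ("Lunch", "Noon?"),
    "<2@example.com>": ("Schedule", "Moved to Friday."),
}
laptop_set = {
    "<1@example.com>": ("Lunch", "Noon?"),
    "<2@example.com>": ("Congratulations", "Moved to Friday."),
    "<3@example.com>": ("Photos", "Attached."),
}
report = classify(original_set, laptop_set)
print(report)
```

Anything landing in the "modified" bucket is exactly what should send up red flags, and only the "modified" and "new" buckets need a human to read them.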
This process is not something they've confirmed publicly, to my knowledge, but as a programmer, that's the logical way in which processing this would proceed, and it would be done in under a week if the boss's neck was on the line.