“The Largest Database Ever Assembled In The World”

John Aravosis wants to know how a database of everybody’s telephone records can possibly beat every other database in the world in terms of size. The simple answer is that it can’t. A modern mainframe could handle a database like that with room to spare. [Update – I should point out that this is an off-the-cuff observation. Better-informed individuals could easily prove me wrong.] John thinks that it would get a lot bigger if you started throwing audio files (e.g., the actual phone calls) into the mix, but I think that he’s barking up the wrong tree. The phone companies might hand over names and numbers but there’s no way that they would record and turn over every single conversation. Besides being obviously, screamingly illegal, it would demand an expensive infrastructure upgrade that somebody would have to pay for.

When you think about what the government might use to flesh out a phone-records database something else comes to mind. Imagine sifting through a massive database looking for questionable activity. What sort of cross-referencing would you find particularly useful? FBI files, for one. Since 9/11 most of us have one and those of us who already had one saw it get a lot fatter. Both databases contain a phone number so matching one to the other would be a cinch. As long as I’m compiling I would probably throw in the TALON database as well. Why not tax records? It’s illegal as hell of course, but on the other hand people are out there plotting to kill us all. It shows how far we have slid as a country that the idea doesn’t even sound that crazy.

Honestly I have no idea what you need to throw in to crack the top ten databases in the world, but John’s right that it has to be more than phone numbers, times and dates. Doubtless we’ll know more soon.






95 replies
  1. 1
    Richard Bottoms says:

    Are the dumb bastards who voted for El Presidente regretting it now?? Damn that Ted Rall for being so negative about such a great man.

  2. 2
    Perry Como says:

    I think it would depend on how you define “large”. Is it the physical size of the database or is it the number of records? I’d imagine that tracking the times and dates of every phone call that goes through every domestic carrier (except the terrorist-loving Qwest) would generate a lot of records.

    Luckily Diebold isn’t behind the technology. I think Access would choke after the first millisecond.

  3. 3
    Keith says:

    The biggest database I’ve ever heard of (and it was multiple terabytes in size 5 or so years ago) was owned by a phone company (and I can’t remember the company name). Hard to beat multiterabyte DBs by “a long shot”. The biggest database I can think of off the top of my head is the *set* of databases that comprise Google, as it contains caches of most of the internet (the Wayback Machine may be larger just considering the cached content) in addition to Usenet, Google Video (ENORMOUS spacehog, *if* it’s stored in the DB as a blob rather than externally as a file), etc.
    Looking at just plain text data, though, I could see the phone data exceeding most other DBs, as there are millions of phone calls starting and stopping every minute, with usage patterns evening out across timezones. While Google is storing a cache of, let’s say, today’s political blog entries, SkyNet (or whatever the real name of this thing is) is recording millions of much smaller but more numerous phone call records.
    Shouldn’t be *too* hard to estimate how much a basic phone call DB could store, as you’ve got a table with 3 fields – source # (varchar of 12 or so characters, assuming they want to store the entire phone number, not just the digits), destination #, and time of call (and datetime is a 64-bit value). Also, factor in that those values (even the datetime) would probably all be indexed, resulting in *more* data for every record (for those who don’t know, indexing a database makes searching for records by things like name or date MUCH faster than brute force). Also, depending on the database type (Oracle, Microsoft, etc.), it could handle memory pages in any number of ways, which can result in some additional size tacked on.
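
The estimate described in this comment can be sketched in a few lines of Python. This is only a rough illustration: the field widths (12-character numbers, 64-bit datetime) come from the comment, but the call volume and the index multiplier are hypothetical placeholders that nobody in the thread supplied, and per-row and page overhead is ignored.

```python
"""Back-of-the-envelope size estimate for a three-field call-record table."""

# Field widths taken from the comment above.
PHONE_NUMBER_BYTES = 12  # varchar of "12 or so" characters
DATETIME_BYTES = 8       # 64-bit datetime

# Hypothetical inputs: neither figure comes from the thread.
CALLS_PER_DAY = 2_000_000_000
INDEX_OVERHEAD = 2.0     # assumed multiplier for indexes on all three fields
DAYS = 5 * 365           # five years, ignoring leap days


def record_bytes() -> int:
    """Bytes per row: source number + destination number + call time."""
    return 2 * PHONE_NUMBER_BYTES + DATETIME_BYTES


def raw_bytes(calls_per_day: int, days: int) -> int:
    """Total bytes of row data, before indexes or page overhead."""
    return record_bytes() * calls_per_day * days


def terabytes(n_bytes: float) -> float:
    """Convert bytes to decimal terabytes (10**12 bytes)."""
    return n_bytes / 10**12


if __name__ == "__main__":
    raw = raw_bytes(CALLS_PER_DAY, DAYS)
    print(f"bytes per record: {record_bytes()}")
    print(f"raw data:         {terabytes(raw):.1f} TB")
    print(f"with indexes:     {terabytes(raw * INDEX_OVERHEAD):.1f} TB")
```

The point is less the specific totals than the shape of the calculation: the row itself is tiny, so the answer is driven almost entirely by whatever call volume and retention period you plug in.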

  4. 4
    Andrei says:

    Imagine sifting through a massive database looking for questionable activity.

    Why not just order a criminal background check on George W. Bush? Only cost you $49.

  5. 5
    Gratefulcub says:

    It’s all so confusing…..

    i like Hookergate. simple, sexy, and scandalous.

    You can tap my phone, just don’t get a blow job

  7. 7
    Keith says:

    I left out a couple of important things: each record would also need a unique ID, which, given this DB’s size, would need to be 64 bits.
    The other thing is that there is a big difference between a database’s capacity and the amount of data it’s actually storing. For instance, the multi-terabyte DB I mentioned previously did not have multiple terabytes of data in there; it was a new system intended to allow data to grow into the capacity. So really, the underlying data files of a database can be 10s of gigs, whereas the server can let that DB grow up to 2 terabytes. So you then ask yourself, is this a 30GB database or a 2TB one?

  8. 8
    Steve says:

    Why not just order a criminal background check on George W. Bush?

    I’ve read, although I don’t know if it’s true, that Bush is the first man to be elected President after being convicted of a crime.

  9. 9
    Jill says:

    I’d like to know why those on the right fight very hard to have gun sales records purged ASAP but think it’s fine and dandy to allow data mining of our telephone records?

  10. 10
    Andrei says:

    Bah… the way that site handles form queries is shot. Just enter the name of anyone you like into the form at the bottom then search. Be sure to search on yourself. I was shocked to discover my addresses even as far back as the 80s were being recorded and saved in such a public database.

    And yes, I’m going to buy a criminal background on George W. Bush from Midland, TX just to see what I get back in the mail.

  11. 11
    terry chay says:

    Keith,

    Minor point. There is a forth piece of information, the duration of the call. I only mention that because it is possibly more useful than the date time of the call.

    This information would also be gathered into an OLAP so you would double the size of the database right there (one for the OLTP and one for the OLAP to do data mining).

    Still it’s trivial. A small internet startup could purchase that amount of computing resources and the NSA purchases more computing resources than all other government TLA agencies combined.

    I think Tim is on the right track. You don’t need the actual audio files or the content mined from them, the usefulness of the call data in building a network profile would cause the network to be very useful, very quickly.

    Since my trackbacks have broken, here is my take.

  12. 12
    terry chay says:

    s/forth/fourth/

  13. 13
    srv says:

    These are the scales for call record info:

    AT&T Daytona/Hawkeye Database

    The Daytona™ data management system is used by AT&T to solve a wide spectrum of data management problems. For example, Daytona is managing over 312 terabytes of data in a 7×24 production data warehouse whose largest table contains over 743 billion records as of Sept 2005. Indeed, for this database, Daytona is managing over 1.924 trillion records; it could easily manage more but we ran out of data.

    This is nothing compared to NSA/NRO. They probably do multi-terabytes/day easily.
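
The Daytona figures quoted above allow a quick sanity check on record size. The sketch below simply divides the quoted totals; it assumes "terabytes" means decimal terabytes (10**12 bytes) and treats the "over ..." figures as exact, so the results are approximations at best.

```python
# Figures quoted above for AT&T's Daytona warehouse (Sept 2005).
TOTAL_BYTES = 312 * 10**12            # "over 312 terabytes", read as decimal TB
TOTAL_RECORDS = 1_924 * 10**9         # "over 1.924 trillion records"
LARGEST_TABLE_RECORDS = 743 * 10**9   # "over 743 billion records"


def average_record_bytes(total_bytes: int, total_records: int) -> float:
    """Average stored bytes per record across the whole warehouse."""
    return total_bytes / total_records


def largest_table_share(largest: int, total: int) -> float:
    """Fraction of all records held by the largest table."""
    return largest / total


if __name__ == "__main__":
    avg = average_record_bytes(TOTAL_BYTES, TOTAL_RECORDS)
    share = largest_table_share(LARGEST_TABLE_RECORDS, TOTAL_RECORDS)
    print(f"average bytes per record: {avg:.0f}")
    print(f"largest table share:      {share:.1%}")
```

That works out to roughly 162 bytes per record on average, noticeably more than a bare three-field call record, which is consistent with the earlier points about indexes and storage overhead (and with the warehouse holding more than just call records).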

  14. 14
    Steve says:

    That’s a very, very good observation, Jill.

  15. 15
    Bone-In RibEye says:

    Imagine sifting through a massive database looking for questionable activity.

    Only an idiot would build a database of this size without concern for data retrieval. How the data will be used is one of the first questions to ask when constructing a database. You don’t just collect and store for a rainy day.

  16. 16
    Pb says:

    Perry Como,

    As far as how you define ‘larger’, I’d say that this database is larger in scope, in that more people are being tracked. As for physical size or number of records, I don’t expect journalists to get that right, even if they did find out.

  17. 17
    srv says:

    I’d like to know why those on the right fight very hard to have gun sales records purged ASAP but think it’s fine and dandy to allow data mining of our telephone records?

    Well, it’s OK to want to be able to shoot somebody anonymously, but not plan anything bigger than that.

    You should see what the gun freaks think of having each bullet built with an RFID tag. Then each bullet is tagged. Heaven forbid.

  18. 18
    tzs says:

    Hmmm. I offer up another candidate for a large database: the stuff that comes down from satellites. Tons and tons of data, especially from weather satellites.

  19. 19
    Mark says:

    I’d like to know why those on the right fight very hard to have gun sales records purged ASAP but think it’s fine and dandy to allow data mining of our telephone records?

    While we’re over-generalizing, why do those on the left defend the prerogative of mothers to kill their infant children, but then get upset about infant mortality rates?

    My theory is because people are stupid.

  20. 20
    Pb says:

    srv,

    Cue Chris Rock. (“You don’t need no gun control. You know what you need? We need some bullet control. Man, we need to control the bullets, that’s right.”)

  21. 21
    ppGaz says:

    I’d like to know why those on the right fight very hard to have gun sales records purged ASAP but think it’s fine and dandy to allow data mining of our telephone records?

    Duh! Can you shoot somebody with a telephone? I don’t think so.

    Can you imagine Charlton Heston holding up a cell phone and saying “Not until you pry it from my cold, dead fingers!” I don’t think so.

  22. 22
    srv says:

    Oh, and as far as DB size. Think about this. As long as your cell phone is on, there is location information (so the phone company knows where to ring you). Either the closest cell towers location or if you have a GPS chip, your GPS signal.

    What if somebody is actually archiving that signal? Here’s a simple DB query for you:

    A) Protest Date/Time
    B) Protest Coordinates
    C) Radius from B

    A+B+C = all the active cell phones at the protest

    Plug that into the Hawkeye DB, and you have the names and addresses of everyone near the protest.
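
To show how little machinery the A+B+C query above needs, here is a minimal Python sketch. Everything in it is hypothetical: the `Ping` record layout, the sample coordinates, and the phone IDs are invented for illustration, and nothing here reflects how any real carrier or agency stores location data. It filters location pings by a time window and a great-circle (haversine) radius.

```python
from datetime import datetime
from math import asin, cos, radians, sin, sqrt
from typing import Iterable, NamedTuple, Set

EARTH_RADIUS_KM = 6371.0


class Ping(NamedTuple):
    """Hypothetical location record: one phone seen at one place and time."""
    phone_id: str
    seen_at: datetime
    lat: float
    lon: float


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points, in kilometres."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def phones_near(pings: Iterable[Ping], lat: float, lon: float,
                radius_km: float, start: datetime, end: datetime) -> Set[str]:
    """Phones with at least one ping inside the time window (A) and within
    radius_km (C) of the given coordinates (B)."""
    return {
        p.phone_id
        for p in pings
        if start <= p.seen_at <= end
        and haversine_km(p.lat, p.lon, lat, lon) <= radius_km
    }


# Invented sample data.
SAMPLE_PINGS = [
    Ping("phone-A", datetime(2006, 5, 1, 12, 30), 38.8895, -77.0353),  # on site, in window
    Ping("phone-B", datetime(2006, 5, 1, 18, 0), 38.8895, -77.0353),   # on site, too late
    Ping("phone-C", datetime(2006, 5, 1, 12, 30), 40.7128, -74.0060),  # in window, far away
]

if __name__ == "__main__":
    hits = phones_near(SAMPLE_PINGS, 38.8895, -77.0353, 1.0,
                       datetime(2006, 5, 1, 12, 0), datetime(2006, 5, 1, 14, 0))
    print(sorted(hits))
```

A real system would use a spatial index rather than scanning every ping, but the filter itself is just a time range plus a distance test, which is the commenter's point.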

  23. 23
    srv says:

    Duh! Can you shoot somebody with a telephone? I don’t think so.

    Hey, there’s an idea. The NRA would care if we put cell phones in guns.

  24. 24
    Perry Como says:

    I offer up another candidate for a large database: the stuff that comes down from satellites.

    I’d offer up a database of Republican excuses as to why they don’t suck. Powerline alone could generate terabytes of drivel.

  25. 25
    demimondian says:

    The really big databases are actually at places like Acxiom (big ‘direct mail’ data warehouse), as well as in the credit bureaus — we’re talking multiple tens of terabytes with hundreds of gigarecords in hundreds of tables.

    Of course, if you think about it, those databases are legally for sale — and they’re likely also included in the putative phone call database. As a result, it’s entirely believable to me that the BushSnooperDooperDatabase could be the largest in the world, even without the audio.

  26. 26
    ppGaz says:

    Hey, there’s an idea. The NRA would care if we put cell phones in guns.

    That is an idea!

    Headline: “Huge Government Database of Gunphone Users!”

    Watch the monkeys rub their bottoms on the ground trying to figure out how to respond to THAT one.

  27. 27
    neil says:

    A friend and I thunk this one out earlier today. The database, if reasonably designed, after 5 years would only be in the hundreds of terabytes — well within the budget of a small university. Of course it doesn’t exist in a vacuum, and I’m sure they have much greater capabilities that they’re not hinting at, but in itself it’s not so big.

  28. 28
    Brian says:

    That’s a very, very good observation, Jill.

    Yes, it is indeed very good…..FOR ME TO POOP ON!!

    The histrionics of the Angry Left is so fun to observe. Tim, you should sell tickets to this site.

    This data is information that can be purchased for the right price from these phone companies. Your data’s already out there. That horse left the barn with the Internet age, and there’s no getting him back in. Besides, it’s not the content of the calls, but, if you will, the wrapper of the content that’s in question. It’s information that can be quickly analyzed, whereas content takes much longer. If there’s a link, or something worth investigating, then it’s probably worth investigating further by accessing the content.

    Bush may have low poll numbers, and for reasons that I can understand. But this is one issue where he’ll have backing from the public, despite the fulminating of butt whiffers. You’re barking up the wrong tree, but it’s a helluva show.

  29. 29
    Brian says:

    That is an idea!

    Don’t think. It weakens the team.

  30. 30
    HyperIon says:

    Tim wrote:

    FBI files, for one. Since 9/11 most of us have one and those of us who already had one saw it get a lot fatter. Both databases contain a phone number so matching one to the other would be a cinch.

    I seem to recall that the FBI has some serious deficiencies in the area of computers.

    So maybe it wouldn’t really be a cinch for them.

  31. 31
    Jackmormon says:

    Tim F.: FBI files, for one. Since 9/11 most of us have one and those of us who already had one saw it get a lot fatter.

    Wait, what?

    Yes, I mutter darkly about what’s in my FBI file, but I haven’t actually gotten to assuming there is one.

  32. 32
    ppGaz says:

    The histrionics of the Angry Left is so fun to observe. Tim, you should sell tickets to this site.

    That’s right, spoofasshole Brian.

    “A Gallup poll recorded a 13-point drop in Republican support for Bush in the past couple of weeks.”

    Is this where Don Meredith sings “The party’s over?”

  33. 33
    ppGaz says:

    “He’s not well liked,” said Douglas Giles, 47, a self-described conservative from Buffalo. “A lot of people don’t think he’s very good.”

    Ah, the histrionics of the Angry Right. You ought to sell tickets, Tim.

  34. 34
    ppGaz says:

    “The problem in my mind, and the only way to explain the very significant erosion is just a disgust with what appears to be a complete abandonment of limited government,” said former Republican congressman Patrick J. Toomey, who runs the conservative Club for Growth. Toomey said commitment to smaller government has been the unifying idea for most elements of the GOP coalition since Ronald Reagan’s presidency. “Republicans have finally had enough,” he said, a sentiment echoed by several other conservative activists and lawmakers.

    Since Bush took office, government spending has increased by more than 25 percent, the largest increase under any president since Democrat Lyndon B. Johnson.

    Is that a wheel coming off?

  35. 35
    LITBMueller says:

    heh. Bet Brian would actually be upset if the FBI was tracking all domestic internet traffic to see what porn sites people are visiting, and using that info for data mining to potentially track down evildoers!

    Seriously, though: how much you all want to bet that NSA hasn’t just limited their “dragnet” to surveilling international phone call content and basic call data for all domestic phone calls? If they were as gung ho as they were in 2001 to establish these programs, then they would not have overlooked the internet and email.

  36. 36
    ppGaz says:

    But wait! Is that the sound of trumpets sounding the charge?

    Karl Rove, Bush’s top political adviser, and GOP leaders are well aware of the problem and planning a summer offensive to win back conservatives with a mix of policy fights and warnings of how a Democratic Congress would govern. The plan includes votes on tax cuts, a constitutional amendment outlawing same-sex marriage, new abortion restrictions, and measures to restrain government spending.

    Bwaaaaaaaaaaaaaahahahahahahaha! Measures to restrain — whaaaaahahahahahahaha — government spending!

    Oh sweet Jesus, I can’t breathe! Hahahahahahahahahaha!

    Outlawing same-sex marriage!

    Oh.My.Fucking.God.

  37. 37
    LITBMueller says:

    The plan includes votes on tax cuts, a constitutional amendment outlawing same-sex marriage, new abortion restrictions, and measures to restrain government spending.

    OH! You mean, all that stuff the GOP keeps promising the base, but never ever does???? LOL!!! Someday….someday…the base is really gonna figure out they’ve been conned.

  38. 38
    Bone-In RibEye says:

    They’ll say it again and it’ll work. You don’t really think we’re voting the repubs out of the majority do you? Not when the people can be blinded with wedge issues.

  39. 39
    Steve says:

    If your constituency consisted of people like Brian you’d pretty much feel like you could get away with whatever the fuck you wanted, too.

  40. 40
    Punchy says:

    What exactly does one do with…billions?…of phone call records? How does one logically mine this? I see this akin to bioinformatics…mining for genes…but what does one conclude about terrorism after learning that Joe Pharmacist called his grandma twice on Monday and Billy’s in 3 large to his college bookie and calls him 5 minutes before every Padres games? Unless, of course, Jake Peavy is about to throw a baseball into an airliner, crashing into the Pacific….

  41. 41
    ppGaz says:

    I’m actually feeling kinda sorry for Bush this moment. These right-wingers are a whiny, ungrateful bunch. After all, he’s declared war on science for God’s sake. What more do they want from the guy?

    Alterman, today. Great line, eh?

    Emphasis God’s.

  42. 42
    Perry Como says:

    You don’t really think we’re voting the repubs out of the majority do you? Not when the people can be blinded with wedge issues.

    And John Kerry wind surfs.

  43. 43
    Bone-In RibEye says:

    And John Kerry wind surfs.

    Exactly. The issues mean nothing. So many people are so set against voting Democrat that it doesn’t take much to keep them on the Republican side. All the work was done years ago painting the Dems as the reason for all society’s ills. Now it’s simply a matter of reinforcement.

  44. 44
    Otto Man says:

    You can tap my phone, just don’t get a blow job

    I think you mean, “You can tap my phone, but don’t tap that ass.”

  45. 45
    LITBMueller says:

    The issues mean nothing. So many people are so set against voting Democrat that it doesn’t take much to keep them on the Republican side. All the work was done years ago painting the Dems as the reason for all society’s ills. Now it’s simply a matter of reinforcement.

    Like I said. Conned.

  46. 46
    Andrew says:

    I think you mean, “You can tap my phone, but don’t tap that ass.”

    That’s between you and James Dobson’s wife.

  47. 47
    Bone-In RibEye says:

    Like I said. Conned.

    I think they’d rather continue to be conned than vote democrat.

  48. 48
    Brian says:

    “A Gallup poll recorded a 13-point drop in Republican support for Bush in the past couple of weeks.”

    And those numbers include people like me, who happen to think he’s failing us more with each passing week, if you can imagine that. But it’s not over this faux scandal. It’s more likely over illegal immigration and gas prices. This NSA thing is your little sandbox toy.

  49. 49
    Andrew says:

    I think they’d rather continue to be conned than vote democrat.

    What’s a little wiretapping compared to being forced to marry your gay neighbor’s gay dog?

  50. 50
    Bone-In RibEye says:

    people like me, who happen to think he’s failing us more with each passing week, if you can imagine that. But it’s not over this faux scandal. It’s more likely over illegal immigration and gas prices.

    Those two items will become the reason you and a lot of people vote Republican. You say those are the hot issues for you, so do a lot of people. By midsummer most people will still be upset at those issues but they will have been told who is to blame for those issues, who’s preventing the issue from being resolved. The blame lies in them not sharing your values or because they have no ideas. They want to take your hard earned money and spend it educating the illegals instead of rounding them up and deporting them. They’re breaking our laws just being here but my opponent wants to give them hugs.

  51. 51
    LITBMueller says:

    It’s more likely over illegal immigration and gas prices. This NSA thing is your little sandbox toy.

    Does that mean we’ve won the GWOT and can move on to another issue with which to divide America? Why not combine them all into one super issue? Gay Mexican Illegal Immigrant Abortion Doctors.

  52. 52
    Perry Como says:

    Gay Mexican Islamic Illegal Immigrant Abortion Doctors Judges.

    hth
    hand

  53. 53
    Fledermaus says:

    Luckily Diebold isn’t behind the technology. I think Access would choke after the first millisecond.

    Plus good luck trying to get a printout of the data.

  54. 54
    demimondian says:

    But it’s not over this faux scandal. It’s more likely over illegal immigration and gas prices. This NSA thing is your little sandbox toy.

    Like the man said, “Conned”.

  55. 55
    ppGaz says:

    those numbers include people like me

    People like you?

    BWAAAAAAAAAAAHAHAHAHAHAHAHAHAHAHAHHAHAHAHAHAHA.

    Oh sweet mother of God, where is my inhaler ……

    You …. and Pinocchio?

  56. 56
    Steve says:

    Wait, Brian blames Bush for gas prices, but not any of this other stuff? Cue Dale Carpenter:

    Judge Posner’s analysis highlights an irony in the current political atmosphere. On the one hand, the Bush administration has suffered relatively little in public opinion for a number of problems over which it has substantial control: indefinite detentions of American citizens without charge or trial, the abuse of prisoners, warrantless domestic surveillance, and the extraordinary growth in the federal deficit, to name a few. On the other hand, the Bush administration has suffered grievously in public opinion for rising gas prices, a trend it cannot and — if Judge Posner is right — should not reverse.

    It’s positively comical. It’s like Fred Phelps hating Bush for not being anti-gay enough. Oh well, live by the base, die by the base!

  57. 57
    Sojourner says:

    Karl Rove, Bush’s top political adviser, and GOP leaders are well aware of the problem and planning a summer offensive to win back conservatives with a mix of policy fights and warnings of how a Democratic Congress would govern. The plan includes votes on tax cuts, a constitutional amendment outlawing same-sex marriage, new abortion restrictions, and measures to restrain government spending.

    I guess we’ll see just how stupid the American people are.

  58. 58

    Every phone number, along with every phone call they’ve made. It’d be pretty big.

    But the phone companies already keep this stuff in their databases, along with billing records. And then, yeah you’ve got the tax records. Actually I still think the biggest databases are probably either maintained by Mastercard/Visa or the credit reporting agencies.

    You know, Barnes & Noble maintains an OLAP database of all their transactions, so they can cut and parse and figure out what people are buying each day in different parts of the country. I read this in some whitepaper from Microsoft. I think they bring in 3 million sales transactions per day into this database. Now B&N is pretty small. I wouldn’t doubt that Wal-Mart does something similar with their sales records.

    So yeah, I don’t understand the big claim and where that came from. Biggest database of telephone records, surely, because it’s a number of companies merged.

    Hey, you know. I wonder if they need some help building an application to query that database. How do I get that job? Now that Duke’s gone, which congressman do I have to bribe?

  59. 59

    And those numbers include people like me, who happen to think he’s failing us more with each passing week, if you can imagine that. But it’s not over this faux scandal. It’s more likely over illegal immigration and gas prices.

    You’d probably blame Bush for Chris Daughtry being booted from American Idol too!

  60. 60
    tBone says:

    Now that Duke’s gone, which congressman do I have to bribe?

    Get yourself some hos and limos; the rest will take care of itself.

  61. 61
    terry chay says:

    @OtherSteve:

    It’d certainly be fun to query that OLAP, especially when you match records with other databases. I mean if you can get over the moral issue of what you are doing.

    I think it may be the biggest database, not in terms of number of records, but in terms of the likelihood that a person’s social network can be mined from it. Given that it is the combined call records of nearly every phone in the country. It is highly unlikely that there is a person working in America who doesn’t make some statistical contribution to that database. I don’t think even Walmart’s vaunted data warehouse can claim that.

    Plus, you can’t mine the connection information (who knows who) from B&N or Walmart’s database. That’s where the rubber meets the road with this database.

    This is why, even though Total Information Awareness is actually an interesting academic idea, it was destroyed by both sides of the aisle and ended it for John Poindexter—something even the Iran-Contra wasn’t able to do.

    Take a look at what happened with TIA and you can see why this issue has political legs.

  62. 62
    Perry Como says:

    Speaking of databases, I once toured the Go Network’s (Disney) data center in Seattle. In one part of the server room they had a bunch of massive db servers lined up on the wall. Across the floor and under the servers were a bunch of i-beams. Turned out the building couldn’t support the weight of the boxen without…

    wait for it…

    load balancing.

  63. 63
    Perry Como says:

    This is why, even though Total Information Awareness is actually an interesting academic idea, it was destroyed by both sides of the aisle and ended it for John Poindexter—something even the Iran-Contra wasn’t able to do.

    TIA wasn’t destroyed. It was just moved out of the limelight. A lot of the work is being done at various universities and some “private” companies.

  65. 65
    terry chay says:

    @Perry Como: I mean “destroyed” as in having its funding legs pulled out from under it in a bipartisan manner (I’d emphasize post-9/11 bipartisan) and being one of the two things that damaged John Poindexter’s reputation (the other being the so-called “terrorist futures market”) to the breaking point.

    It may be true that data-mining research is still being done at the university and think tank level. I’d expect nothing less—ideas like data-mining information are exactly the thing DARPA is supposed to be there for. Hell, one could say that IEM is a sort of research-level version of the “terrorist futures market,” but without the data, it ain’t TIA, not by a long shot.

    But, what is going on now is simply saying, “Let’s screw the research nature, and the cost/benefit, and just implement it.”

  66. 66
    terry chay says:

    BTW, the linked article implies that most of the TIA research is being done by government agencies, not just by universities:

    Notwithstanding the defunding of TIA and the closing of the IAO, several TIA projects continued to be funded under the classified annexes to the Defense and the Intelligence appropriation bills in 2003 and subsequently.

    For example, several TIA projects were funded through the National Foreign Intelligence Program for foreign counterterrorism intelligence purposes by the National Security Agency as Advanced Research and Development Activity (ARDA) under the classified annex to the 2004 DOD Appropriations Act as contemplated in §8131 thereof. Recent reports suggest that some of this activity is now part of the Disruptive Technology Office (DTO) reporting to the Director of National Intelligence.[1]

    I didn’t mean to imply “destroyed” as in “it doesn’t exist,” I meant to imply that it was “politically destroyed.” Your point is very valid and I should have qualified my term.

  67. 67
    scalefree says:

    You don’t have to speculate about the size of at least the AT&T portion of the database. Here are two links from AT&T itself that describe it. You can extrapolate from there about Verizon & Sprint, then add a fudge factor for additional space required to incorporate data from other databases & also the associational links between records that are the point of the system.

    http://www.research.att.com/~daytona/
    http://www.research.att.com/~daytona/inuse.php

  68. 68
    demimondian says:

    I wonder if they need some help building an application to query that database. How do I get that job?

    Hey, with that attitude towards evil, I can think of three perfect companies for you to think about.

  69. 69
    scalefree says:

    For reference, here is Winter Corp’s 2005 Top 10 Databases list:

    http://www.wintercorp.com/VLDB.....s_2005.asp

    It all depends on how you measure it, but AT&T’s Daytona is clearly #1 or #3 all by itself. Add in MCI, Verizon, additional fields for customer name/address etc. & the associational links & there’s no question the NSA database is the biggest in the world.

  71. 71

    Hey, with that attitude towards evil, I can think of three perfect companies for you to think about.

    google and MS only give you free soft drinks. You don’t get limos and hos like the coosh government contractors.

  72. 72
    demimondian says:

    google and MS only give you free soft drinks. You don’t get limos and hos like the coosh government contractors

    Yeah, but with the right to party like it’s 1999 with your favorite authoritarian dictator of the week, why would you care about the piker perks that come to bureaucrats? Dictators can spread the ho-ho-ho without fear of an ethics inquiry, after all.

  73. 73
    Evilbeard says:

    And here I was hoping to read a big explanation of why this is ok from Darrel.

  74. 74

    Bush may have low poll numbers, and for reasons that I can understand. But this is one issue where he’ll have backing from the public, despite the fulminating of butt whiffers. You’re barking up the wrong tree, but it’s a helluva show.

    LOL. Brian you sure are a riot.

    Too bad you’re not trying to be funny.

  75. 75
    Andrew says:

    It all depends on how you measure it, but AT&T’s Daytona is clearly #1 or #3 all by itself. Add in MCI, Verizon, additional fields for customer name/address etc. & the associational links & there’s no question the NSA database is the biggest in the world.

    There are plenty of questions. That is a list of commercial databases. There are dozens of much larger databases within any number of US gubmit agencies, including NOAA, NASA, NGA, etc. to say nothing of the Library of Congress, which is putting everything online.

    NSA itself probably has much larger databases from various programs, including Echelon.

  76. 76
    Marcus Wellby says:

    “It’s the largest database ever assembled in the world,” said one person, who, like the others who agreed to talk about the NSA’s activities, declined to be identified by name or affiliation.

    Is this a factual statement, or more like a “gee wow” kind of thing coming from someone who does not work with large databases?

  77. 77

    Yeah, but with the right to party like it’s 1999 with your favorite authoritarian dictator of the week, why would you care about the piker perks that come to bureaucrats? Dictators can spread the ho-ho-ho without fear of an ethics inquiry, after all.

    Yeah, but Dictators also have guns. You ever watch Lord of War?

    I’ll take the hos and limos in DC, thank you!

  78. 78
    TBone (Big says:

    I keep hearing some knuckleheads fretting about the gubmint tapping phones.

    What’s a little wiretapping compared to being forced to marry your gay neighbor’s gay dog?

    Let me make this clear: NO ONE IS TAPPING YOUR PHONE! (Caveat: if you are engaged in terrorism and are associated with terrorists, then perhaps your phone is being monitored, and the prior statement doesn’t apply.) So unless you are a terrorist, you need not wring hands and gnash teeth. I always laugh when people jump to conclusions about things they KNOW NOTHING ABOUT. They are harbingers of doom, yet are so totally ignorant about intelligence/law enforcement techniques. And instead of researching and trying to understand, they just make shit up and start running their mouths in protest.

    Also Tim said:

    As long as I’m compiling I would probably throw in the TALON database as well.

    Tim probably doesn’t realize the Talon DB contains almost NO PHONE NUMBERS, but instead is simply a record of “suspicious” incidents on or around military facilities that have been reported by citizens. For example, if Joe Citizen sees a person in a parked car taking photographs of Sensitive “Military Installation X” (which is sometimes illegal BTW) then that report will go into the Talon database. If that same person comes back at a later date, then perhaps the previous incident will give the investigator an indication that something might be wrong.

    The Talon database is nothing nefarious (as advertised by ignorant teeth gnashers) and is simply smart (and legal) investigative activity.

  79. 79
    demimondian says:

    TBone, Tim may not know anything about military intelligence, but I can say with a fair amount of confidence that you don’t know much about relational databases. Why? This remark:

    Talon DB contains almost NO PHONE NUMBERS, but instead is simply a record of “suspicious” incidents on or around military facilities that have been reported by citizens.

    One basic operation in a (normal) database query is a join, in which data from two tables is merged into a temporary record set according to a Boolean and/or functional relationship between those tables. (And, yes, TOC, etc., I know about stored procedures and other optimizations; let’s not go there, OK? It just complicates the story.) The key thing about a join is that two tables need not share a common key field, or, indeed, any common field in order to be joined.

    Shorter demimondian: what Tim suggests is entirely possible.
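
    To make the point about joins concrete, here is a minimal sketch using Python's built-in sqlite3 module. Everything in it is hypothetical: the table names, the columns, and the sample rows are invented for illustration and say nothing about what TALON or any phone-records database actually contains. The two tables share no key and no column; the join condition is a time window plus a value derived from the caller's number.

```python
import sqlite3


def join_without_shared_key():
    """Join two hypothetical tables that have no column in common."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Hypothetical incident reports: no phone numbers at all.
        CREATE TABLE incidents (incident_id INTEGER, area_code TEXT, seen_at INTEGER);
        -- Hypothetical call records: no incident ids at all.
        CREATE TABLE calls (caller TEXT, callee TEXT, placed_at INTEGER);
        INSERT INTO incidents VALUES (1, '555', 1000), (2, '777', 5000);
        INSERT INTO calls VALUES
            ('555-0101', '555-0199', 1010),
            ('555-0102', '777-0100', 3000),
            ('777-0103', '555-0100', 5030);
    """)
    # The ON clause is a Boolean test (time within 60 units of the incident)
    # combined with a functional one (prefix computed from the caller number).
    rows = conn.execute("""
        SELECT i.incident_id, c.caller, c.callee
        FROM incidents AS i
        JOIN calls AS c
          ON c.placed_at BETWEEN i.seen_at - 60 AND i.seen_at + 60
         AND substr(c.caller, 1, 3) = i.area_code
        ORDER BY i.incident_id
    """).fetchall()
    conn.close()
    return rows
```

    Calling join_without_shared_key() pairs each incident with the calls placed nearby in time from a matching prefix, even though neither table stores the other's identifiers. Whether such a match means anything is a separate question from whether the query can be written.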

  80. 80
    Steve says:

    Atrios was right: these guys really DO get top-secret briefings on exactly how our surveillance capabilities work every time a story breaks. It’s amazing how they can assure you, beyond a shadow of a doubt, that your phone is not being tapped, or what the contents are of an NSA database.

  81. 81
    Darrell says:

    Steve Says:

    Atrios was right: these guys really DO get top-secret briefings on exactly how our surveillance capabilities work every time a story breaks. It’s amazing how they can assure you, beyond a shadow of a doubt, that your phone is not being tapped, or what the contents are of an NSA database

    In such cases where specifics are not known for sure, it’s usually best to start conclusion jumping, screaming ignorant accusations about how Bush is “illegally” wiretapping our phones.. It’s called, speaking truth to power, and it feels righteous

  82. 82
    Darrell says:

    Jill Says:

    I’d like to know why those on the right fight very hard to have gun sales records purged ASAP but think it’s fine and dandy to allow data mining of our telephone records?

    Not entirely unfair point, although your analogy would be more valid if the data mining was listening in to the conversation.. just as gun sales receipts would reveal the details of the transaction. As I understand it, there is recent history, in Australia and I think UK too, in which the govt used gun receipts to track and confiscate guns held by law abiding owners when gun laws changed and were subsequently outlawed. So maybe they have good reason to worry here too, given what they’ve seen happen in other similar countries

  83. 83
    Perry Como says:

    So maybe they have good reason to worry here too, given what they’ve seen happen in other similar countries

    It happened in the US too, after Katrina. The NRA had to get an injunction from a judge to prevent LEOs in New Orleans from confiscating registered guns.

    I’m genuinely confused when “conservatives” roll over when programs like the NSA keeping a massive database of every call Americans have made comes up. Are “conservatives” really that scared of terrorists?

  84. 84
    Steve says:

    In such cases where specifics are not known for sure, it’s usually best to start conclusion jumping, screaming ignorant accusations about how Bush is “illegally” wiretapping our phones.. It’s called, speaking truth to power, and it feels righteous

    The flip side of that is assuming that nothing is going on beyond what the government chooses to tell us, because the government has never abused its spying powers in the past (right?), refusing to support any factual inquiries that might give us more information about whether abuses are occurring, and quoting any wild legal theory you happen to run across that says hey, even if the law is being violated, rest assured the President has the authority to break the law. Does sticking your head in the sand really feel better than what you just described?

    Democrats have proposed such unreasonable measures to deal with the NSA wiretapping program as, say, submitting the program to the FISA court to get their opinion on whether it complies with the law. Now, only a screaming moonbat would support that kind of thing, right? Wouldn’t want to have someone with the appropriate security clearances actually find out the specifics and tell us if it’s legal, because it’s much better to go around repeating “no one knows the specifics, so you have no basis to complain!”

  85. 85
    Darrell says:

    I’m genuinely confused when “conservatives” roll over when programs like the NSA keeping a massive database of every call Americans have made comes up

    Yes, just like we ‘rolled over’ in carrying Social security cards and passports to travel overseas, when the government fascists told us we had to.

  86. 86
    Perry Como says:

    Yes, just like we ‘rolled over’ in carrying Social security cards and passports to travel overseas, when the government fascists told us we had to.

    I think SSNs have been massively abused and people warned about it when they were implemented. But I still don’t understand how people who claim to support small government can shrug their shoulders when the government increases massively and wants to record everything you do. Ripe for abuse is an understatement.

  87. 87
    scalefree says:

    There are plenty of questions. That is a list of commercial databases. There are dozens of much larger databases within any number of US gubmit agencies, including NOAA, NASA, NGA, etc. to say nothing of the Library of Congress, which is putting everything online.

    You didn’t look at the page closely enough. There are 3 widgets across the page: Metric, Platform & Usage. Click the Usage widget & choose the other 2 options to see databases in the scientific & online transaction categories. “Usage” really should be labeled “Category”. Anyway, those other lists include the databases you’re talking about & Daytona’s still in the top 3 by itself.

  88. 88
    Steve says:

    When conservatives say they want “small government,” what they really mean is that they want lower taxes. It’s the libertarians who care about Big Brother and such. The gulf between these two groups gets wider every day.

  89. 89
    Darrell says:

    But I still don’t understand how people who claim to support small government can shrug their shoulders when the government increases massively and wants to record everything you do.

    You’d be hard pressed to find a conservative who doesn’t feel betrayed by Bush’s big govt, big spending ways. Which btw, was supported by a big spending Republican dominated congress. No argument there.

    As for the ‘recording everything’ we do part, I don’t see that happening. From what I’ve read about the NSA programs, I don’t see anything particularly unreasonable about them. If they start recording my conversations to aunt Edith without warrant, then I’ll change my position

  90. 90
    Scott says:

    OK I’m no expert on databases and I don’t know for sure what’s in this one but let’s look at what they have access to and what is rumored to be in there.

    If you have telephone records of who called who and then who they call and how long they talked and maybe you know what was on the news when and you knew the names and phone numbers of some Democratic activists, seems to me you could follow the phone trees and if you had a friend somewhere you could tap a line in an out of the way place and probably do a lot of little things that could throw elections. Also figure they have everyone’s tax records in a database they could merge, also the DOD database of young males they use for recruiting but never purge, you can merge that one in or run queries from the databases in series, or some resourceful database engineer could probably tell a lot just from the call trees that are generated, especially political call trees, they may find people giving advice or passing information. This may have been one long run on sentence but that doesn’t change the fact that if you don’t trust the current administration and with no oversight being done you should be very scared

    Scott (a 20 year vet, and a democrat)

  91. 91
    Andrew says:

    You didn’t look at the page closely enough. There are 3 widgets across the page: Metric, Platform & Usage. Click the Usage widget & choose the other 2 options to see databases in the scientific & online transaction categories. “Usage” really should be labeled “Category”. Anyway, those other lists include the databases you’re talking about & Daytona’s still in the top 3 by itself.

    Those are public government databases that are one to two orders of magnitude smaller than individual datasets (let alone databases) that I’ve seen from various government sources. 23K GB? Talk to me when you’re in the petabyte range and we might take you seriously.

  92. 92
    Al Maviva says:

    How does one logically mine this

    Start with a list of a couple thousand overseas phone numbers of known terrorists – nothing precludes collecting that stuff except for hysteria about echelon and tactical collection. Collect the numbers they dial on incoming calls to the U.S., or via Europe where collecting content is still legal and you can still know whether the content is terror related. Collect those U.S. numbers dialed to from the known or very likely terrorist numbers they dialed, but just at the wrapper level, not content, no personal identifying information other than the number – that’s where you would start. Let’s call them the “T” set of phone numbers.

    Now take all the domestic anonymized phone numbers with three data items, originating number, destination number, and time of call. “555-1212 called 121-2555”, “121-2555 called 551-2125,” etc.

    Match up the domestic call database, anonymized, to phone calls to or from the T subset. Build a big social network diagram detailing all calls into the T phone numbers, and looking at the incoming/outgoing numbers – follow that web of numbers out two or three degrees of separation from the T subset, or even further. Pay special attention to where two T subset phone numbers come into contact with each other directly, or through several degrees of separation. The T numbers are suspect; other numbers linking two T numbers through a fairly short chain could yield leads. For instance, if – T1 called A, A called B, B called C, and C called T2 and T3, you might want to take a closer look at that social web.

    Pooh and I discussed this quite a while back. I’m not sure if that is how it works, or if it would be legal or advisable, but I’ve seen comparable things done with fairly simple social network diagrams, so I’d be surprised if it’s beyond gubmint capabilities.
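
    The chaining described in this comment can be sketched in a few lines of Python. This is only an illustration of the idea under stated assumptions, not a claim about how any agency does it: the function names, the treatment of calls as undirected links, and the three-hop default are all invented here. It runs a breadth-first search outward from each T number and reports the non-T numbers that sit within the hop limit of two or more different T numbers.

```python
from collections import defaultdict, deque


def hops_from(seed, graph, max_hops):
    """Breadth-first search: map each reachable number to its hop count from seed."""
    dist = {seed: 0}
    queue = deque([seed])
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue  # do not expand past the hop limit
        for nxt in graph[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def linking_numbers(calls, t_set, max_hops=3):
    """Return non-T numbers within max_hops of at least two distinct T numbers.

    calls is an iterable of (caller, callee) pairs; direction is ignored
    (an assumption made for this sketch).
    """
    graph = defaultdict(set)
    for caller, callee in calls:
        graph[caller].add(callee)
        graph[callee].add(caller)
    reached_by = defaultdict(set)
    for t in t_set:
        for number in hops_from(t, graph, max_hops):
            if number not in t_set:
                reached_by[number].add(t)
    return {n: ts for n, ts in reached_by.items() if len(ts) >= 2}
```

    Fed the example from the comment (T1 called A, A called B, B called C, C called T2 and T3), the sketch flags A, B and C at three hops, and only C at one hop. Note how quickly the flagged set grows with the hop limit, which is one reason the choice of "degrees of separation" matters so much.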

  93. 93
    Harry says:

    The infrastructure is already in place for most of the recording. Check out the requirements of CALEA.

  94. 94

    […] From a comment by a Republican reader of this blog: […]

  95. 95

    demimondian said:

    TBone, Tim may not know anything about military intelligence, but I can say with a fair amount of confidence that you don’t know much about relational databases.

    Demi,

    I may not write databases, but I use them quite often. One in particular is called “Analyst Notebook”, another is “Crime Link”. The underlying database portion of the programs can be either the proprietary DB or Microsoft Access. When one builds those particular databases, he can choose the fields he desires, and search within them. Phone numbers may be one of the fields utilized, but not always. Time/event charts can be generated, as well as an association matrix.

    Therefore, what I said regarding the TALON DB is accurate. The TALON has nothing to do with NSA or phone numbers. If a report coming out of the field is derived from a concerned citizen’s observation of “suspicious” activities, there is NO PHONE NUMBER that will be reported; especially if the “suspicious person” wasn’t arrested for a crime subsequent to the report, and no number was obtained by the police.

    I appreciate you coming to Tim’s defense, but please don’t pop off with some bullshit just to be contrary.
