Social Security numbers can be predicted from publicly available information in government sources, commercial databases or online social networks, according to researchers at Carnegie Mellon University.
Alessandro Acquisti, an associate professor of information technology and public policy, and Ralph Gross, a post-doctoral researcher, have found that an individual’s date and state of birth are sufficient to guess his or her Social Security number. The study findings were published in the Proceedings of the National Academy of Sciences.
The predictability of Social Security numbers is an unexpected consequence of seemingly unrelated policies and technological developments that, in combination, make Social Security numbers obsolete for authentication purposes, according to Acquisti and Gross.
Acquisti said: “In a world of wired consumers, it is possible to combine information from multiple sources to infer data that is more personal and sensitive than any single piece of original information alone.”
The researchers used the Social Security Administration’s Death Master File, a public database with Social Security numbers, dates of birth and death, and states of birth for every deceased beneficiary. Acquisti and Gross said that impostors could mine the file for statistical patterns that would help them predict the Social Security numbers of living people.
In the researchers’ sample, a single attempt was enough to identify the first five digits for 44% of deceased individuals born after 1988 and for 7% of those born between 1973 and 1988. All nine digits could be identified in fewer than 1,000 attempts for 8.5% of those born after 1988.
According to the report, prediction accuracy was considerably higher for smaller states. It was higher still for people born after 1989, whose Social Security numbers are assigned shortly after birth.
The results were almost as good when the researchers tested their method using birth dates and hometowns that students had self-reported on popular social networking sites, despite the inaccuracies typical of social network data.