Odds of dating someone with the same birthday
This also tells us something about match probabilities within our MDM solutions, and critically it affects how we view matching for Big Data: the larger the data set, the more likely it is that false positives will occur.
This means that when we create probabilities for name matching, they should be driven not by a fixed assessment of likelihood but by a combination of factors, including the number of times the name appears in the source records.
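To make the effect of name frequency concrete, here is a minimal sketch. It assumes birth dates are spread uniformly over a fixed number of possible days, which is an idealisation, and the function name, parameters and figures are hypothetical. It estimates how many pairs of different people with the same name would also share a date of birth purely by chance.

```python
def expected_chance_pairs(name_count: int, possible_dates: int) -> float:
    """Expected number of record pairs with the same name that also share a
    date of birth by coincidence, assuming uniformly distributed dates."""
    # Every pair of records carrying the name is a potential false match.
    pairs = name_count * (name_count - 1) // 2
    # Each pair shares a date with probability 1 / possible_dates.
    return pairs / possible_dates


# Hypothetical figures: birth dates spread over 50 years of 365 days.
POSSIBLE_DATES = 50 * 365

for count in (10, 1_000, 100_000):
    print(count, expected_chance_pairs(count, POSSIBLE_DATES))
```

Because the number of pairs grows roughly with the square of the number of records carrying a name, a common name in a large source can produce many coincidental name and date of birth matches, while a rare name produces almost none.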
(This Web site can teach you how to calculate probability: Probability Central from Oracle Think Quest.)
Observations and results
Did about 50 percent of the groups of 23 or more people include at least two people with the same birthday?
When comparing probabilities with birthdays, it can be easier to look at the probability that people do not share a birthday.
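The calculation through the complement, that is, the probability that no two people in the group share a birthday, can be sketched as follows. This is a minimal illustration that assumes 365 equally likely birthdays and ignores leap years; the function name is an assumption made for this example.

```python
def prob_shared_birthday(people: int, days: int = 365) -> float:
    """Probability that at least two of `people` share a birthday,
    assuming every day of the year is equally likely."""
    prob_all_different = 1.0
    for i in range(people):
        # Person i must avoid the i birthdays already taken.
        prob_all_different *= (days - i) / days
    # "At least one shared birthday" is the complement of "all different".
    return 1.0 - prob_all_different


for group_size in (10, 22, 23, 50):
    print(group_size, round(prob_shared_birthday(group_size), 4))
```

Under these assumptions, a group of 22 falls just short of an even chance and a group of 23 is the first to exceed it.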
The third person then has 20 comparisons, the fourth person has 19 and so on.
If you add up all possible comparisons (22 + 21 + 20 + ... + 1) the sum is 253 comparisons, or combinations.
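The count of 253 can be checked in a few lines. This is a small sketch with an illustrative function name; it uses the closed form n(n - 1)/2 for the number of pairs and confirms it against the running sum and an explicit enumeration.

```python
from itertools import combinations


def comparisons(people: int) -> int:
    """Number of distinct pairs that can be formed from `people` individuals."""
    return people * (people - 1) // 2


# Running sum for a group of 23: 22 + 21 + ... + 1.
running_sum = sum(range(1, 23))

# Explicit enumeration of every pair in the group.
enumerated = len(list(combinations(range(23), 2)))

print(comparisons(23), running_sum, enumerated)  # 253 253 253
```

All three approaches agree, which is why a group of only 23 people already offers 253 chances for a matching birthday.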
One of the questions that gets asked regularly is ‘what is a good set of matching criteria?’ and the answer is, of course, ‘it depends on what good quality information you can get’.
One piece that is put forward is the combination of name and date of birth as a good indicator of uniqueness. However, the Birthday Paradox shows that if you have 23 people in a room, it's better than 50/50 that two will have the same birthday. For most people at school, where the set is restricted to people around your own age, this normally meant two people with the same birth date (day, month and year). On one occasion at school I was in a class with two people with a birthday of August 31, so the question of who was ‘youngest’ in the year was a matter of a few hours.