A new report from the Federal Trade Commission outlines several risks related to discrimination for companies using “big data” to make decisions about credit, employment and marketing. While the report noted that big data has the potential to expand access to credit by creating opportunities for underserved populations, it said that creditors and other big data users must understand the legal framework that governs data-driven decisions.
To help companies mitigate the legal risks of using big data, the FTC encouraged data users to evaluate how representative their data sets are, whether their data model accounts for biases, how accurate the model’s predictions turn out to be and whether relying on big data raises fairness concerns. For example, the report said that reducing a customer’s credit limit based not on that customer’s own payment history but on the payment history of similar individuals would be unfair.
The report also noted that “simply adding more data does not necessarily correct inaccuracies or remove biases.” Even “very careful” analysis may be subject to scrutiny if a company’s models “use variables that turn out to operate no differently than proxies for protected classes,” the report added. To facilitate compliance, the report provides a list of questions companies using big data should consider.
The report said companies should understand the applicability of the Fair Credit Reporting Act, the Equal Credit Opportunity Act (and other civil rights laws) and the Federal Trade Commission Act in using big data. “Companies should review these laws and take steps to ensure their use of big data analytics complies with the discrimination prohibitions that may apply,” the agency said. “The Commission will continue to monitor areas where big data practices could violate existing laws … and will bring enforcement actions where appropriate.”
The use of big data is growing as the cost of acquiring, accessing, analyzing and storing various kinds of data continues to fall rapidly. Big data provides a foundation for predictive analytics, in which companies use a variety of data sets to make predictions about how customers might respond to a marketing pitch or how a loan might perform.