The Big Brain in the Black Box

By Kathleen Ryan

Artificial intelligence, machine learning and alternative data are increasingly touted for improving fraud detection, compliance, credit underwriting, and banking operations in general. For credit underwriting, AI and alternative data may result in more accurate underwriting decisions and may help to expand credit to consumers without full credit histories as measured by current practices.

As banks consider incorporating AI and alternative data into their operations and decision-making processes, they must weigh the anticipated benefits against the untested nature of these technologies and the fair lending risks of using complex “black box” algorithms and data that have not previously been used in credit decision-making.

Applying half-century-old laws like the Fair Housing Act and the Equal Credit Opportunity Act to new technologies can be challenging. When the laws and related guidance were written, consumers applied for credit at a local bank branch, and underwriting and pricing decisions were primarily based on manual processes. Then, the primary fair lending concern was discriminatory policies or bank employees who would let individual bias creep into decision-making processes. Today, regulators encourage innovation, but they express concerns that AI and machine learning could have hidden biases. Moreover, the sheer number of attributes considered by advanced systems might result in unintentional discrimination against protected classes.

Alternative data gets a few careful nods from regulators

Alternative data include rent and utility payment history, educational attainment, social media use and other behavioral information not traditionally factored into credit decisions. These data can help lenders assess the creditworthiness of consumers who may lack experience with traditional forms of credit, such as credit cards. However, uncertainty about how regulators view alternative data has made lenders hesitant to incorporate them into their decision-making.

While regulators clearly expect banks to use new technology without violating legal standards, it is less clear what level of scrutiny a bank must apply to emergent technologies and how it can realistically review alternative data sets that may involve thousands of data elements. To add complexity to a bank’s due diligence, AI and alternative data sets may be the intellectual property of vendors or other third parties. Banks may not even have access to the black box. Even if a bank has access to the underlying technology, it may lack the in-house expertise to fully analyze the models and data used.

Recently, however, regulators have offered a few guideposts that may help banks define an approach to putting some of the new solutions to work. In December 2019, the OCC, FDIC, Federal Reserve, NCUA and the CFPB issued an interagency statement on the use of alternative data in credit underwriting. In it, the agencies encouraged the responsible use of alternative data and highlighted tools widely in use today to address fair lending risk in traditional underwriting and traditional data: strong compliance management systems, including testing, monitoring and controls, as well as model risk governance more generally for safety and soundness purposes.

More significantly, the interagency statement offers some clues as to how regulators think about fair lending risks from the use of alternative data in credit decision-making. The agencies cite the use of cash flow data to analyze an applicant’s ability to repay a loan as presenting low risk, first because the data are directly related to a consumer’s finances, second because they come from reliable sources (that is, bank accounts) and finally because their use would be relatively easy to explain in an adverse action notice under ECOA. By implication, then, banks can assume that regulators would view alternative data that are non-financial in nature and gleaned from non-traditional sources as riskier from a fair lending perspective.

The second case in which a regulator has weighed in on alternative data involves Upstart Network’s successful application to the CFPB for a no-action letter (NAL). Upstart uses alternative data, including applicants’ educational and occupational attainment, along with traditional data and machine learning, in credit underwriting and pricing decisions. When the CFPB granted the NAL in 2017, one of the conditions was that Upstart test its model against a more traditional model and share the results with the CFPB. In 2019, the CFPB reported that the testing results were encouraging. Upstart’s model appears to expand access to credit compared to a traditional model, and approval rates and APRs for minority, female, and older applicants “showed no disparities that require further fair lending analysis under the compliance plan.”
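The kind of side-by-side comparison described above can be sketched in a few lines. What follows is a minimal illustration, not Upstart’s or the CFPB’s actual methodology: the function names, the group labels "A" and "B" and every decision record are hypothetical, made up only to show how approval rates under two models might be compared group by group.

```python
from collections import defaultdict


def approval_rates(decisions):
    """Return {group: approval rate} from (group, approved) pairs."""
    totals = defaultdict(int)
    approved = defaultdict(int)
    for group, ok in decisions:
        totals[group] += 1
        if ok:
            approved[group] += 1
    return {g: approved[g] / totals[g] for g in totals}


def adverse_impact_ratio(rates, reference_group):
    """Divide each group's approval rate by the reference group's rate."""
    ref = rates[reference_group]
    return {g: r / ref for g, r in rates.items()}


# Hypothetical, invented decisions for two groups under two models.
traditional = (
    [("A", True)] * 60 + [("A", False)] * 40
    + [("B", True)] * 40 + [("B", False)] * 60
)
alternative = (
    [("A", True)] * 70 + [("A", False)] * 30
    + [("B", True)] * 63 + [("B", False)] * 37
)

trad_rates = approval_rates(traditional)
alt_rates = approval_rates(alternative)
trad_air = adverse_impact_ratio(trad_rates, "A")
alt_air = adverse_impact_ratio(alt_rates, "A")

for name, rates, air in (
    ("traditional", trad_rates, trad_air),
    ("alternative", alt_rates, alt_air),
):
    for group in sorted(rates):
        print(f"{name:12s} group {group}: rate={rates[group]:.2f} ratio={air[group]:.2f}")
```

In this invented data set the alternative model approves more applicants in both groups and narrows the gap between them. At what ratio a gap warrants further fair lending analysis is a policy and legal judgment that the sketch does not attempt to answer.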

The Upstart NAL is a positive development for lenders considering using alternative data in credit decisions. However, an NAL requires a formal application that must identify not only the benefits to consumers of using alternative data, but also the risks to consumers and how the applicant will mitigate them. The path to approval may be lengthy. Upstart’s CEO recently told the House Financial Services Committee that it took “several years of good faith efforts between Upstart and the CFPB to determine the proper way to measure bias.” And for the rest of the industry, an NAL protects only the recipient, in this case Upstart, from CFPB supervisory or enforcement action and does not signal the bureau’s general approval of the alternative data and models that Upstart uses.

Targeted marketing—more efficient, more risky

AI and alternative data increasingly offer banks the ability to target marketing to those consumers deemed more likely to be interested in a product or service. The more closely targeted marketing is, the more efficient it is likely to be. However, closely targeted advertising can raise more fair lending risk than broader marketing campaigns. Targeting raises fair lending risk either because it uses a factor that is, or may be a proxy or substitute for, a prohibited basis, such as sex, or because the factor may exclude prospective applicants who are minorities or members of another protected class.

A recent example of how targeted marketing can cause legal and regulatory headaches involves the social media platform Facebook. Over the last few years, Facebook has been subject to multiple lawsuits and government enforcement actions, including actions by the Department of Housing and Urban Development. According to HUD’s complaint, Facebook violated the Fair Housing Act by redlining, that is, by allowing advertisers to ensure that ads were not seen by certain Facebook users based on the neighborhoods they live in. Facebook also allegedly offered advertisers “hundreds of thousands of attributes” by which they could filter who would see ads, among them categories like “women in the workforce,” “moms of grade school kids,” “foreigners,” “Puerto Rico Islanders,” or people interested in “parenting,” “accessibility,” “service animal,” “hijab fashion,” or “Hispanic culture.”

Facebook has settled these cases with consumer advocacy groups by altering its advertising for housing and credit to avoid the use of ZIP codes and attributes that either are or correlate with prohibited bases. It has also agreed to allow all Facebook users to view ads for credit or housing anywhere in the United States. However, HUD’s FHA complaint against Facebook is still pending. HUD’s complaint alleges that, separate and apart from allowing advertisers to select audiences, “[t]o decide which users will see an ad, [Facebook] considers sex and close proxies for the other protected classes. Such proxies can include which pages a user visits, which apps a user has, where a user goes during the day, and the purchases a user makes on and offline.” HUD further alleges that Facebook varies the price of ads based in part on how likely a user is to respond to the ad, which is determined “by sex and close proxies” for prohibited bases.

Monitoring this case will be important to understanding the FHA risks of targeted marketing via social media. It goes without saying that banks engaged in targeted marketing should steer well clear of targeting on the bases described in the Facebook complaints.

Federal Reserve staff has highlighted the fair lending risks of targeted marketing and the need for careful review, analysis, and monitoring. However, reviewing marketing for factors that raise fair lending concerns is challenging, both because of the sheer number of attributes that may be used and because some attributes appear neutral on their own but may have a disparate impact on protected classes in combination with others. For example, ZestFinance, which offers AI and machine learning for credit underwriting, recently noted that AI for used car loans could have a disparate impact on African Americans given the interaction of a car’s higher mileage and the consumer’s state of residence. Furthermore, a bank may not have access to the black box to be able to closely analyze how it works and what data are considered.
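The point that attributes can look neutral one at a time yet skew in combination can be illustrated with a small sketch. It borrows the mileage and state attributes from the used car example above, but everything else is an assumption: the records are invented, the labels "protected", "other", "X" and "Y" are placeholders, and the `group_share` helper is hypothetical rather than any vendor’s or regulator’s method.

```python
def group_share(records, predicate, group="protected"):
    """Share of the records selected by `predicate` that belong to `group`."""
    selected = [r for r in records if predicate(r)]
    if not selected:
        return None
    return sum(r["group"] == group for r in selected) / len(selected)


def make(n, high_mileage, state, group):
    """Build n identical hypothetical applicant records."""
    return [
        {"high_mileage": high_mileage, "state": state, "group": group}
        for _ in range(n)
    ]


# Invented data: each attribute alone is balanced across groups,
# but the combination (high mileage AND state X) is not.
records = (
    make(40, True, "X", "protected") + make(10, True, "X", "other")
    + make(10, True, "Y", "protected") + make(40, True, "Y", "other")
    + make(10, False, "X", "protected") + make(40, False, "X", "other")
    + make(40, False, "Y", "protected") + make(10, False, "Y", "other")
)

overall = group_share(records, lambda r: True)
mileage_only = group_share(records, lambda r: r["high_mileage"])
state_only = group_share(records, lambda r: r["state"] == "X")
combined = group_share(
    records, lambda r: r["high_mileage"] and r["state"] == "X"
)

print(f"overall={overall:.2f} mileage={mileage_only:.2f} "
      f"state={state_only:.2f} combined={combined:.2f}")
```

In this made-up data, half of all applicants are in the protected group, and the same is true of high-mileage applicants and of state X applicants taken separately. Among applicants who are both, the share rises to 80 percent, so a model that penalizes the combination would fall disproportionately on that group even though a review of each attribute in isolation would find nothing.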

Recent actions alleviate some concern

As noted, AI and machine learning present challenges for banks trying to assess fair lending risks. Regarding disparate impact risk, the challenges and costs associated with analyzing and demonstrating that an alternative data variable or algorithm does not cause a disparate impact on a prohibited basis can overwhelm the capabilities of most banks and the business case for using new technologies. Nevertheless, regulators have issued explicit expectations for banks using AI. For example, the OCC cautioned banks about fair lending risks from AI in its 2019 Semiannual Risk Perspective, stating that bank management must understand and monitor underwriting and pricing models for potential disparate impact and other fair lending issues and be able to explain and defend the models’ decisions.

However, HUD has recently taken action that may make it somewhat easier for banks to consider using AI in credit decisions. In August 2019, HUD proposed revising its FHA regulations to set limits on the use of disparate impact theory in FHA claims, consistent with the Supreme Court’s 2015 decision in Texas Department of Housing and Community Affairs v. Inclusive Communities Project. The Court held that the FHA prohibits unintentional discrimination but discussed the need for courts to apply important safeguards to disparate impact claims to ensure that businesses are not held liable for impacts they did not cause. In light of the Supreme Court’s ruling, HUD’s proposed rule would give businesses certain defenses to claims of disparate impact involving the use of algorithms. These defenses could be an important first step—particularly if the other regulatory agencies issue similar statements—toward providing clear and authoritative guidance for banks as they consider whether to use AI or machine learning.

What’s next?

Many banks may decide to wait on venturing into AI and alternative data until regulators provide greater clarity. However, in an environment where technological capabilities outpace regulators’ capacity to issue guidance, that certainty may never arrive. Moreover, nonbanks are already experimenting with AI and other technologies, and they are likely to raise consumers’ expectations for faster loan approvals and more offers. Competitive pressure may build for banks to be able to offer the same services to their customers, forcing more and more banks to experiment out of necessity.

Banks that are on the fence may also want to consider that the current environment is more favorable to innovation than ever before. The banking agencies have not only issued the joint statement on alternative data, limited though it is, but have also each taken separate steps to encourage innovation. The OCC has an Office of Innovation; the FDIC created FDiTech and is hiring a chief innovation officer to lead it; the Federal Reserve has office hours for innovation discussions; and the CFPB’s Office of Innovation recently issued new policies to encourage financial institutions to seek official relief from regulatory uncertainty, including the NAL policy as well as a compliance assistance sandbox policy.

Deciding when and how to implement AI, machine learning and alternative data tools is a major strategic undertaking. Banks will want to ensure that all stakeholders—including legal, risk and fair lending officers—are at the table when decisions are made.

Kathleen Ryan is VP and senior counsel for fair and responsible banking at ABA. In addition to prior service as deputy assistant director at the Consumer Financial Protection Bureau and senior counsel at the Federal Reserve, she has been an attorney in private practice and senior regulatory counsel for a large bank.