Mystery Shopping: Taking the Mystery out of Fair Lending

By Nicholas Roesler, CRCM

Mystery shopping is a long-standing testing technique used by various agencies and entities to test for fair treatment in credit transactions, particularly in the housing market. The surging focus on racial equity across many different aspects of society and commerce includes more attention on fair lending practices at banks. For example, one of the immediate priorities for the Biden administration is to advance racial equity and civil rights. Similarly, racial equity is a current stated priority for the Consumer Financial Protection Bureau, with Acting Director Dave Uejio remarking, “It’s also time for the CFPB to take bold and swift action on racial equity. . . . This of course means that fair lending enforcement is a top priority and will be emphasized accordingly. But we will also look more broadly, beyond fair lending. . . .”

This article originally appeared as the cover story in the July/August 2021 issue of ABA Bank Compliance magazine.

Given the expected increase in the level of scrutiny in concert with rising expectations for overseeing and rooting out unlawful credit discrimination, now is a crucial time for fair lending compliance professionals to be thinking beyond traditional areas of fair lending focus. Understanding mystery shopping both as a form of internal self-testing and as a technique that your bank may be subjected to by external organizations is just one of many relevant topics to raise your fair lending IQ.

What is mystery shopping?

The general concept behind this type of fair lending testing is to have two individuals who are carefully matched for creditworthiness characteristics (and often for similar personal appearance) walk into a bank and apply for credit. However, these two individuals should differ in a prohibited basis group characteristic, such as race or sex. Then, the treatment of the pair should be observed and the customer outcomes reviewed. Typically, this is not done by a third party who watches the interactions; instead, the “mystery shoppers” themselves experience and document their treatment.

In many cases, the prohibited basis group tester is even positioned to have slightly better credit than the control group tester. The mystery shopping is mainly focused on the application or pre-application phase of the credit process. In this phase, the test can determine whether the testers’ experiences raise fair lending concerns related to overt discrimination or disparate treatment, including discouraging an application.

Discouragement is covered in §1002.4(b) of Regulation B, which states that a “creditor shall not make any oral or written statement, in advertising or otherwise, to applicants or prospective applicants that would discourage on a prohibited basis a reasonable person from making or pursuing an application.” The official interpretation of Regulation B gives examples of discouragement, including, “A statement that the applicant should not bother to apply, after the applicant states that he is retired.” Other areas of fair lending concern that can be evaluated during a mystery shopping test include overtly discriminatory comments, differing levels or quality of assistance, differing terms or conditions quoted such as loan rate, payment, or loan amount, differing information regarding eligibility, and/or other behavior.

Mystery shopping examples

As mentioned, mystery shopping is not a new testing tool. The following are a handful of examples of mystery shopping conducted by external nonbank entities.

As a part of a 2016 joint fair lending action by the CFPB and Department of Justice (DOJ), the CFPB disclosed its first use of mystery shopping and noted that other government agencies and housing organizations “have used testers for decades as a method of identifying discrimination.” The CFPB press release stated: “As part of its investigation, the CFPB sent testers to several BancorpSouth branches to inquire about mortgages, and the results of that testing support the CFPB and DOJ allegations. The agencies allege that, in several instances, a BancorpSouth loan officer treated the African-American tester less favorably than a white counterpart. Specifically, the complaint alleges that BancorpSouth employees treated African-American testers who sought information about mortgage loans worse than White testers with similar credit qualifications. For example, BancorpSouth employees provided information that would restrict African-American consumers to smaller loans than white testers.”
In 2018 the National Fair Housing Alliance conducted an investigation of eight automobile dealerships in eastern Virginia using paired testing mystery shopping that covered auto lending. Information from the NFHA stated the following about the matching methodology: “NFHA conducted one paired test at each dealership. Within each test, a White tester and a better qualified non-White tester inquired about purchasing the same new 2017 car with the same vehicle identification number within 24 hours of one another….Each tester was equipped with a concealed digital audio recorder that captured his or her experience at the dealership from arrival to departure.”
In November 2020, a study was released by the National Community Reinvestment Coalition regarding Paycheck Protection Program loans that used matched-pair testing. NCRC performed past matched-pair “mystery shopper” tests in 2017, 2019, and earlier in 2020. The second round of this NCRC testing had an interesting twist: the mystery shopping was done over the phone. Locations included 60 bank branches representing 47 financial institutions in the Los Angeles, California, metro area. There were a total of 30 male and 30 female multi-layered matched tests consisting of white, Black and Hispanic testers, totaling 180 interactions. The tester profiles were set up with racially identifiable names and voices that were predetermined to have characteristics that would signal a perceived race over the phone. It’s unclear if NCRC factored in gender identity when assigning gender to testers based on name and voice. Should gender identity assumptions be reliably made based on first name and tone/pitch of voice alone? What happens within a mystery shopping test when these assumptions are inconsistent with how individuals identify themselves or are non-binary?
On March 1, 2021, the New York attorney general entered into an agreement with an Ohio-based bank resolving an investigation into the bank’s alleged deceptive advertising practices. In early 2018, and again in 2019, the Buffalo Niagara Community Reinvestment Coalition conducted testing of the availability of a deposit product offered in New York. This testing found that while the deposit product subject to this review was advertised as being available, it was not available to the testers who attempted to use the program.
On Jan. 27, 2021, a 97-page report was released on fair housing and discriminatory practices involving real estate brokers and agents on New York’s Long Island. This joint work of three New York state committees was essentially a follow-up to a 2019 investigation by Newsday, which used mystery shopping. Per the New York report, “Newsday used paired testing, a practice regularly endorsed by federal and state courts as the sole viable method for detecting violations of fair housing law by real estate agents…. In its three-year probe, Newsday recruited and trained 25 individuals to pose as undercover homebuyers, tested 93 real estate agents, collected 240 hours of recorded interactions, and analyzed 5,763 house listings.” Newsday equipped testers with hidden cameras. The state of New York is reportedly seeking to suspend or revoke licenses for agents publicly cited in the investigation. Expecting to file more complaints, New York has also opened additional investigations. The NY report also recommends proactive investigation and enforcement of fair housing laws through testing (see next section for related lawmaking). While this example is not directly related to credit under Regulation B, it provides an illustration of the impacts and consequences of mystery shopping results.

Attention from lawmakers

Both federal and state lawmakers have shown interest in using mystery shopping as a way to investigate fair lending. In January 2021, the Fair Lending for All Act was reintroduced, which, among other things, called for a new office within the CFPB to engage in mystery shopping-style testing. The bill states that “[t]he Office, in consultation with the Attorney General and the Secretary of Housing and Urban Development, shall conduct testing of compliance with the Equal Credit Opportunity Act by creditors, through the use of individuals who, without any bona fide intent to receive a loan, pose as prospective borrowers for the purpose of gathering information.”

On Feb. 11, 2021, the New York State Senate passed Bill S.112, which requires the New York attorney general to conduct annual fair housing testing to assess compliance with fair housing laws throughout New York State, including “covert investigations conducted for the purpose of comparing how members and non-members of a protected class are treated when they are otherwise similarly situated, and gathering evidence of compliance with fair housing provisions pursuant to Human Rights Law.”

Privileged and confidential?

An entire section of Regulation B, §1002.15, is dedicated to covering incentives for self-testing and self-correction. While mystery shopping can vary in methodology, it often falls under the category of “self-testing,” which §1002.15(b)(1) of Regulation B defines as follows:

A self-test is any program, practice, or study that:

1. Is designed and used specifically to determine the extent or effectiveness of a creditor’s compliance with the Act or this part; and
2. Creates data or factual information that is not available and cannot be derived from loan or application files or other records related to credit transactions.

The official interpretation adds that self-testing includes, but is not limited to, the practice of using fictitious applicants for credit (testers), either with or without the use of matched pairs.

Mystery shopping often qualifies as a self-test under the regulatory definition because the tester’s observations and experience themselves create new information used to test for fair lending under Regulation B.

As stated in Regulation B, reports or results from voluntary self-testing are considered privileged information, meaning the results do not need to be shared with regulators (and/or other external entities); however, there are specific parameters that must be satisfied in order to assert privilege. First, any test or data collection that is required by law or by a government agency doesn’t qualify (§1002.15(a)(1)). Second, appropriate corrective action needs to be taken on the results. Appropriate corrective action is needed when the self-test indicates a likely violation of the Equal Credit Opportunity Act or Regulation B. If the results don’t indicate a violation, the corrective action prerequisite for privilege is essentially satisfied automatically (with some caveats per §1002.15(c), see Corrective Action section below).

It’s up to the creditor conducting the self-test to determine whether corrective action is required and to take further corrective action as necessary. The official interpretation of §1002.15(a)(1) states: “If a creditor’s claim of privilege is challenged, an assessment of the need for corrective action or the type of corrective action that is appropriate must be based on a review of the self-testing results, which may require an in camera [emphasis added] inspection of the privileged documents.”

In other words, the confidentiality of mystery shopping results is open to challenge by regulators. Further, a creditor can lose privilege if:

The results are voluntarily disclosed,
Privileged information is disclosed as part of a defense to charges of a Regulation B violation, and/or
Certain information, such as that required under §1002.12(b), can’t be provided to a regulator.

Lastly, it’s important to know that certain information related to the self-test is not privileged, including information stating whether a self-test was conducted, the methodology, scope, time period, or records related to the credit applications or loans (§1002.15(b)(3)).

What is ‘appropriate corrective action’?

As discussed above, the appropriateness of corrective action is a factor that determines whether mystery shopping is defined as a “self-test” under Regulation B, and is also a prerequisite for invoking privilege. So, what exactly is “appropriate corrective action”?

Regulation B and its official interpretation set out these requirements in §1002.15(c). “Appropriate corrective action is required when it is more likely than not that a violation occurred, even though no violation has been formally adjudicated.” In determining the likelihood of a violation, the official interpretation to Regulation B instructs creditors to think of testers as if they were actual credit applicants and warns that a tester’s waiver of legal rights doesn’t change the general corrective action requirement, except for the fact that creditors are not required to provide remediation to testers.

Under a self-test, a root cause analysis is important. Regulation B requires creditors to take “corrective action that is reasonably likely to remedy the cause and effect of a likely violation.” This includes assessing policies, practices, and “the extent and scope of any violation” on a case-by-case basis. The official interpretation suggests that the scope of corrective action need only be focused on the scope of the self-test. If mystery shopping was focused on mortgage loan applications, the corrective action does not need to be expanded to all loan types and all stages of the credit process.

For example, if a creditor conducts pre-application mystery shopping for auto loans, then the focus of corrective actions should also correspond to the pre-application process for auto loans. The creditor is not required to expand testing to other types of loans. The same limit applies to geographic scope: corrective action can focus on the branches or offices where the violations likely occurred.

While both prospective and remedial corrective action should be considered, the official interpretation of Regulation B §1002.15(c) states, “the use of pre-application testers to identify policies and practices that illegally discriminate does not require creditors to review existing loan files for the purpose of identifying and compensating applicants who might have been adversely affected.” The official interpretation also includes the following examples of appropriate corrective action:

If the self-test identifies individuals whose applications were inappropriately processed, offering to extend credit if the application was improperly denied and compensating such persons for out-of-pocket costs and other compensatory damages;

Correcting institutional policies or procedures that may have contributed to the likely violation, and adopting new policies as appropriate;

Identifying and then training and/or disciplining the employees involved;

Developing outreach programs, marketing strategies, or loan products to serve more effectively segments of the lender’s markets that may have been affected by the likely discrimination; and

Improving audit and oversight systems to avoid a recurrence of the likely violations.

Is this the right tool for your fair lending program?

Mystery shopping can be a useful tool to detect inconsistencies that point to potential fair lending risks. A Harvard University study published in 2008 describes many reasons, going beyond compliance risk, why lenders and the nation should engage in self-testing of lending practices, including economic risk, ethics, legal liability, reputation risk, and business risk.

An effective fair lending program, or really any compliance management system (CMS), is set up to self-identify and self-correct issues (including customer redress or remediation). Intuitively, performing mystery shopping internally would allow a bank to identify weaknesses and mitigate them rather than having an issue surface from an external or public-facing view.

However, there are limitations, including cost, scope, administration, implementation and complexity of execution. Moreover, possible third-party risks and other uncontrolled factors that arguably diminish the conclusiveness of mystery shopping results with respect to unlawful credit discrimination can also be viewed as limitations. As noted, readiness to take corrective action is also key. At a minimum, this is a decision that should be made with careful consideration of multiple risks and in consultation with your trusted legal professional.

One basic question to consider is simply: Are your fair lending program fundamentals in good order? Before considering whether mystery shopping may be an appropriate addition to your bank’s fair lending program, you might find value in first exploring other ways to strengthen existing controls and oversight of pre-application risks. This could include a thorough review of branch procedures, banker interviews, branch visits, complaint monitoring, training and/or customer experience metrics.

While advocates suggest lenders engage in mystery shopping and other forms of self-testing, no public guidance from banking regulators requires or recommends mystery shopping as an expected component of a CMS. The predominant federal banking regulators (Federal Reserve, OCC, FDIC, CFPB) have rarely engaged in mystery shopping. As mentioned, the CFPB has utilized this technique in only one investigation since its inception, where it appeared to be a supplemental investigative measure reserved for extreme cases. On the other hand, all of these regulators incorporate fair lending statistical testing and file reviews as a routine part of examinations. So, another question to ask in advance of developing a fair lending mystery shopping program is: Do you already have a strong monitoring and analytics program?

Another aspect to consider is the ever-increasing adoption of digital banking and what that might mean in the context of mystery shopping. The digital information trail enables a bank to capture interactions for future evaluation, perhaps even more viably than face-to-face branch customer interactions that are not typically recorded.

As noted in the NCRC PPP testing over the phone, studies have shown that the race of an individual can often be determined by name alone, and race can also be perceived through voice. While the testing was reportedly done via telephone because of the pandemic, when branches re-open, mystery shopping over the phone could continue and prove to be an easier method for community groups to conduct mystery shopping. Might we eventually see reports of chatbots or artificial intelligence being deployed as a way to test these interactions in a highly controlled way? While the future is never certain, spending time to challenge the status quo of your fair lending program will always be worthwhile.

Nicholas Roesler, CRCM, CAMS, is SVP and fair and responsible banking officer at U.S. Bank. He leads the fair and responsible banking program, and is responsible for overseeing and managing fair lending, UDAAP, and HMDA risk across the enterprise. Prior to joining U.S. Bank, he was a commissioned examiner at the Federal Reserve Bank of Minneapolis, where he led consumer compliance exams and CRA evaluations.