Today, women comprise only 25 percent of the STEM workforce, 4 percent of Fortune 500 CEOs, and earn 79 cents for every dollar paid to men, amounting to an average income difference of $10,762 per year. The numbers tell the story—gender inequality is still a pervasive problem in the U.S.
Women's empowerment initiatives, such as Sheryl Sandberg's "Lean In" movement, have gained momentum in recent years, and organizations have become increasingly interested in promoting gender equality in the workplace. But the question remains: how best to do so?
According to Iris Bohnet, a professor of public policy at Harvard University’s Kennedy School of Government, good behavioral design is part of the solution. In her new book, What Works: Gender Equality by Design, Bohnet describes the advantages of behavioral design and discusses the evidence-based tools available to help organizations correct biases and foster gender equality.
Alisa Yu: Why should schools, businesses, and governments focus their efforts on behavioral design as opposed to other methods, such as training or educating people about the importance of gender equality?
Iris Bohnet: Sadly enough, we don’t have enough evidence on whether training programs work or not. Most companies that incorporate diversity training programs have not evaluated them in any formal sense. The little we do know is either based on correlational analyses or on diversity training programs outside the corporate sector. The analyses conducted by Alexandra Kalev at Tel-Aviv University and Frank Dobbin in the Harvard Sociology Department examined whether diversity programs and workforce diversity are in any way correlated. They pointedly found that there was basically no relationship. That was certainly not very encouraging.
Betsy Paluck, a psychologist at Princeton, conducted some interesting field experiments in Rwanda. She also didn’t find that reconciliation training changed mindsets, but it did change what people thought were appropriate behaviors. That’s how I came to think about de-biasing organizations rather than mindsets. Generally, as behavioral scientists, we haven’t been particularly good at using training to help people change their minds in sustainable ways. More encouraging news comes from the nudge approach, which focuses on changing the environment.
AY: What recommendations do you have for organizations that want to reduce bias and promote gender equality in the workplace?
IB: Organizations can start by thinking about how they attract new employees. Many job advertisements include gendered language, even job advertisements that target counter-stereotypical individuals. For example, even schools that would like to increase their fraction of male teachers still use the words “compassionate” or “caring” in the job advertisement, words [that] are stereotypically associated with women. Research [shows] that controlling for the stereotypical gender of the job, women are more likely to apply to jobs that use more feminine stereotypical language, whereas men are more likely to apply to jobs that use more masculine stereotypical language.
The second thing that organizations can and should think about is the screening and interview stage. There are many biases that can creep into the initial screening. For example, we are influenced by someone’s looks even though we know that appearance is not a good predictor of future performance or trustworthiness. To make sure that at least the first screening is done objectively, many companies blind the resumes at the entry stage so as not to be influenced by demographic characteristics. [But] once you’ve screened and focused on the quantitative aspects of an application, how do you evaluate job candidates during interviews? Sadly enough, research shows that unstructured interviews are terrible predictors of performance. I don’t believe that organizations will do away with personal interviews in the near future, but they should move to structured interviews and leave unstructured interviews behind. Organizations should also think about replacing unstructured interviews with work sample tasks. Work sample tasks are behavioral tests that are related to what people actually have to do in the profession. The goal of these tasks is to assess as accurately as possible a candidate’s aptitude for the future job.
“Companies that use potential, in addition to performance, as a way to evaluate employees are more likely to be gender-biased.”
Thirdly, an important insight of behavioral science is that every judgment, every evaluation we make—not just of the people applying, but also of the small things, like coffee that we drink—is relative to what we’re used to. When we evaluate job candidates, we tend to want to compare them to candidates we’ve always seen or people who are typically in this kind of job. Max Bazerman, Alexander van Geen, and I have shown in our research that if evaluators compare a job candidate with a real alternative rather than a stereotype in their heads, they’re actually able to focus better on performance, and don’t need to rely as much on the stereotypical reference point to calibrate their judgments.
AY: One of your recent collaborations was with the Behavioral Insights Team to develop a hiring platform called Applied. In your opinion, how have technological tools advanced our progress towards gender equality?
IB: I think we’re at the beginning, actually. I don’t think we have that many tools yet. But I think it’s a relatively safe prediction that technology and machine learning will play a more important role in HR. Along with Unitive and GapJumpers, Applied is a tool that tries to help people make more objective decisions. I think we’ll be seeing more of these tools, and I do believe that they’ll play a huge role in HR. Many companies have the resources but not the expertise, so I think they will be relieved to learn about technology that can help them do a better job of selecting the right kinds of job candidates.
AY: What are some things schools or policymakers should do to promote the success of both male and female students?
IB: It starts with the kinds of tests that we design. As of March 2016, for the first time, the SAT’s multiple-choice section is gender debiased. A former student of mine, Katherine Baldiga Coffman, conducted really interesting research to understand whether multiple-choice tests, such as the SAT, [which] included a penalty [for] wrong answers, led to differentiated responses from risk-loving people and risk-averse people. It turns out that on the old SAT, if you could exclude at least one alternative out of the five, you should guess. Guessing was the dominant strategy. But if you’re risk averse, you’re more likely to skip the question than to guess.
What Katie found was that women were significantly more likely than men to skip, which, controlling for their ability, cost them dearly on the SAT. The College Board redesigned the SAT with a lot of changes; among them was removing the penalty for wrong answers. In an interesting way, this levels the playing field, because willingness to take risks no longer plays a role in willingness to answer.
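The dominance of guessing on the old SAT follows from a quick expected-value calculation. The sketch below assumes the old scoring rules as commonly described: five answer choices, one point for a correct answer, and a quarter-point penalty for a wrong one.

```python
def expected_guess_score(num_remaining, penalty=0.25):
    """Expected points from guessing uniformly among the remaining choices,
    assuming +1 for a correct answer and -penalty for a wrong one."""
    p_correct = 1.0 / num_remaining
    return p_correct * 1.0 - (1.0 - p_correct) * penalty

# Blind guess among all 5 choices: breaks even with skipping (EV = 0).
print(expected_guess_score(5))  # 0.0
# Eliminate even one choice and guessing strictly beats skipping.
print(expected_guess_score(4))  # 0.0625
```

Since skipping yields exactly zero points, ruling out a single alternative makes guessing the better strategy in expectation, which is why risk-averse test takers who skipped anyway were penalized.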
AY: For organizations that are new to behavioral design, what would you recommend they prioritize first if they want to create fairer opportunities for women who are already in their organization?
IB: These days, many organizations are struggling with performance appraisals. In my work with companies, I analyze who benefits from current appraisal structures and how appraisal processes are conducted. Here’s what we’ve found: companies that use potential, in addition to performance, as a way to evaluate employees are more likely to be gender-biased. That’s not surprising because potential is less easily measurable than performance. Secondly, we generally find that leadership is associated with men, and potential has something to do with career advancement and climbing up the career ladder. We don’t necessarily associate career and leadership with women.
“We haven’t been particularly good at using training to help people change their minds in sustainable ways. More encouraging news comes from the nudge approach, which focuses on changing the environment.”
I also recommend to every organization not to share employee self-evaluations with managers. Behavioral science suggests that we are anchored by numbers thrown at us. So if my team consisted of both men and women, I might find that women give themselves lower ratings. If I hadn’t seen the ratings as their supervisor, I might give them both the same rating. But now that I’ve seen their differing self-ratings, I might downgrade him a little and upgrade her a little. I’d make her a bit happier and him slightly less happy, but the differential would still be bigger than if I had rated them without self-evaluations. I don’t think sharing self-evaluations is useful, certainly not when the manager has yet to make up his or her mind.
AY: As work becomes increasingly collaborative, it’s important to think about diversity in team settings. In your book, you mention that when you form teams in your classes, you don’t always create heterogeneous groups. Can you talk about why that is and what other factors you consider when forming a team?
IB: I’m very concerned about critical mass. There’s a lot of evidence suggesting that being the only one, the only woman, the only American, the only non-native speaker, or the only economist on a team will turn that team member into a token. Tokenism isn’t good for anyone—it’s not good for the team member or for the functioning of the team. So when I form teams and I don’t have, for example, an equal representation of men and women in the classroom, I sometimes have to form homogeneous teams to make sure that none of the teams includes only one person from the underrepresented sex. So numbers matter. Research shows that diversity in all its shapes and forms, not just demographic but also cognitive diversity, increases the collective intelligence of a team. But I think some of the diversity discussion may have focused too much on just the numbers game. Making diversity work on a team is much more than having the right fractions of different groups represented.
Iris Bohnet, Professor of Public Policy, is a behavioral economist at Harvard Kennedy School, combining insights from economics and psychology to improve decision-making in organizations and society, often with a gender or cross-cultural perspective. Professor Bohnet served as the academic dean of the Kennedy School, is the director of its Women and Public Policy Program, the co-chair (with Max Bazerman) of the Behavioral Insights Group, an associate director of the Harvard Decision Science Laboratory, and the faculty chair of the executive program “Global Leadership and Public Policy for the 21st Century” for the World Economic Forum’s Young Global Leaders.
Alisa Yu graduated from Rice University in 2012, where she studied Psychology, with a focus in Industrial/Organizational research. She is currently working at the University of Pennsylvania as a Research Coordinator in the Duckworth Lab, and will begin a PhD in Organizational Behavior at Stanford’s Graduate School of Business in the Fall of 2016.
Further Reading and Resources
- Bohnet, I. (2016). What Works: Gender Equality by Design. Cambridge, MA: Belknap Press.
- Baldiga, K. (2014). Gender Differences in Willingness to Guess. Management Science, 60(2), 434-448.
- Bohnet, I. (2016). How to Take the Bias Out of Interviews. Harvard Business Review.
- Bohnet, I., van Geen, A. & Bazerman, M. H. (2016). When Performance Trumps Gender Bias: Joint versus Separate Evaluation. HKS Faculty Research Working Paper Series RWP12-009, John F. Kennedy School of Government, Harvard University.
- Kalev, A., Dobbin, F., & Kelly, E. (2006). Best Practices or Best Guesses? Assessing the Efficacy of Corporate Affirmative Action and Diversity Policies. American Sociological Review 71(4): 589-617.
- Paluck, E. L., & Green, D. P. (2009). Deference, Dissent, and Dispute Resolution: An Experimental Intervention Using Mass Media to Change Norms and Behavior in Rwanda. American Political Science Review 103(4): 622-644.