Analytics and Big Data against discrimination in hiring

Why are Data Science and Big Data effective methods of fighting against discrimination in hiring?

By revealing the real reasons for success and improving the detection of “weak signals” (invisible to the naked eye), analytics and big data uncover more unusual talents: they are a reliable lever for avoiding cloning in access to employment.

The opinion of Benoît Binachon, founder of the recruitment company Uman Partners and former founder and director of a data science company (effiScience).

The human mind tends to produce clones: first for psychological reasons (it’s reassuring), and second because it lacks the capacity to cross-reference and sort data in a way that is relevant enough to really predict success.

Big Data* is today a good opportunity to widen traditional sourcing, open up the field of possibilities and even fight discrimination in hiring. With genuinely meaningful data and relevant models, this approach is in fact a reliable way to find successful profiles that the human mind would rarely have detected, and to find more of them. It is therefore about giving unusual profiles a chance more often and more easily!

There are already many articles praising Big Data… Here I propose to look more closely at how it works in practice.

Is there anything new in employment access via “Big Data”?

Some analysis algorithms are old. For over 40 years, the Americans in particular have been designing tools for predicting an individual’s success in a future job, based on his or her leadership competencies and on the characteristics of the company and the post. What is new in access to employment via “Big Data” is, without doubt, the ability to predict a talent’s success in a future post using new web data. The appearance of social networks, and of other spaces where talents express and describe themselves, has in fact given us access to a very large quantity of data relating to individuals in their working environment.

What are the misconceptions about using data for employment access?

Let’s dispel the first myth: no analysis worthy of the name will lead us to declare that “in France you must recruit a man, a graduate of the Polytechnique, at least 40 years old, to maximise success in the given post”. In fact, any good analyst will have detected the correlations between the factors in this assertion and the success that we are seeking to explain, and will therefore have removed these factors from the dataset used for the analysis. This is another way of saying: “we already know that this category is on average more successful than others, but what we are interested in here is the more complex realities hidden below these factors”. In other words: “a good analysis does not confuse cause and effect”.
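
As an illustration only, here is one possible way to set aside an already known factor before looking for what lies beneath it. Instead of simply dropping the column, this sketch subtracts each group's average success, so that only the variation inside each group remains. The records, the field names (`school`, `sectors`, `success`) and the helper `residual_success` are all hypothetical and are not taken from any real dataset or tool.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: "school" is the already known factor,
# "sectors" (number of sectors worked in) is a less obvious one.
profiles = [
    {"school": "elite", "sectors": 1, "success": 0.8},
    {"school": "elite", "sectors": 3, "success": 1.0},
    {"school": "other", "sectors": 1, "success": 0.4},
    {"school": "other", "sectors": 3, "success": 0.6},
]

def residual_success(rows, known_factor):
    """Subtract each group's mean success so the known factor explains nothing more."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[known_factor]].append(row["success"])
    group_mean = {key: mean(values) for key, values in groups.items()}
    return [
        {**row, "residual": row["success"] - group_mean[row[known_factor]]}
        for row in rows
    ]

# What remains once the school effect has been set aside?
adjusted = residual_success(profiles, "school")
by_sectors = defaultdict(list)
for row in adjusted:
    by_sectors[row["sectors"]].append(row["residual"])
effect = {sectors: round(mean(values), 3) for sectors, values in by_sectors.items()}
print(effect)
```

In this toy data, the graduates of the "elite" school are more successful on average, but inside each school group the profiles with more sectors do better. That second pattern is the kind of reality "hidden below" the obvious factor.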

The second myth: no serious analysis of a talent’s success will be led astray simply because the data entered is imprecise or even false. In fact, analytics can measure the “noise” (the error rate or the rate of false declarations) and thus attach a reliability indicator to the results, or even show that they are not meaningful. For the noise to go undetected by the analysis, it would have to be consistent, which is impossible.
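
One standard way to attach such a reliability indicator is a permutation test; the sketch below is an assumption about how this could be done, not a description of a specific tool. It asks how often a gap in success at least as large as the observed one appears when the feature labels are shuffled at random. The function name `permutation_p_value` and the toy data are hypothetical.

```python
import random
from statistics import mean

def permutation_p_value(feature, success, n_permutations=2000, seed=0):
    """Share of random label shuffles whose success gap is at least the observed gap."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible

    def gap(labels):
        with_feature = [s for f, s in zip(labels, success) if f]
        without_feature = [s for f, s in zip(labels, success) if not f]
        return abs(mean(with_feature) - mean(without_feature))

    observed = gap(feature)
    shuffled = list(feature)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # shuffling keeps both group sizes unchanged
        if gap(shuffled) >= observed - 1e-12:
            hits += 1
    # The +1 terms count the observed arrangement itself.
    return (hits + 1) / (n_permutations + 1)

# Hypothetical data: 10 profiles with the feature, 10 without.
feature = [1] * 10 + [0] * 10
clear_success = [1] * 9 + [0] + [0] * 9 + [1]  # feature strongly linked to success
noisy_success = [1, 0] * 10                    # success unrelated to the feature

p_clear = permutation_p_value(feature, clear_success)
p_noise = permutation_p_value(feature, noisy_success)
print(p_clear, p_noise)
```

A small value suggests the link is unlikely to be produced by noise alone; a value close to 1 indicates that the result is not meaningful.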

Once these two hurdles are overcome, we can understand the impending emergence of “Big Data” applications for access to employment. As with all analytics work (in any domain), this will probably require the participation and strong involvement of people in the business.

A clear example of Big Data as a decision-making tool in recruiting talent:

Using the LinkedIn profiles of talents experienced in a given career and belonging to a homogeneous social category, and thanks to text mining algorithms, we can construct a database linking all the characteristics of these talents, including their declared competencies, their number of “endorsements” and a characterisation of their career progression (here again, text mining algorithms can read the profile and qualify the success).

Using this database, we can calculate a predictive and prescriptive classification of a talent’s success in his or her future post, or construct explanatory rules for this success. The granularity of these results depends on the predictive and explanatory value of the data: it may be quite coarse if the data is very noisy because the profiles are poorly completed; it could be fine if the majority of the talents in question are transparent and precise in their profiles, and if their endorsements are representative.
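
To make the idea of a predictive classification concrete, here is a minimal sketch using a nearest-neighbour vote. This technique is chosen for brevity and is not necessarily what a real project would use. The three features (number of job changes, number of sectors, endorsements in tens), the training rows and the function `success_score` are hypothetical; a real database would come from the text mining step described above.

```python
from math import dist

# Hypothetical training rows: (job_changes, sectors, endorsements_in_tens), successful?
training = [
    ((4, 3, 5), True),
    ((5, 3, 4), True),
    ((3, 2, 6), True),
    ((1, 1, 1), False),
    ((0, 1, 2), False),
    ((1, 2, 0), False),
]

def success_score(candidate, rows, k=3):
    """Fraction of the k most similar known profiles that were successful."""
    # Euclidean distance assumes the features are on comparable scales.
    nearest = sorted(rows, key=lambda item: dist(item[0], candidate))[:k]
    return sum(1 for _, successful in nearest if successful) / len(nearest)

print(success_score((4, 2, 5), training))  # resembles the successful profiles
print(success_score((1, 1, 2), training))  # resembles the unsuccessful ones
```

With noisy or poorly completed profiles, the neighbours of a candidate would be a mix of successes and failures and the score would hover around the middle, which is one way of seeing the coarse granularity mentioned above.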

How does a “recruiter” use these results? By running the LinkedIn profile of a candidate applying for a post through the predictive model. And/or by examining, thanks to the explanatory model, the conditions under which the candidate is suited to the post: he or she has changed jobs often enough + has worked in multiple sectors + had his or her latest experience in a medium-sized company + etc.
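
The explanatory side can be pictured as a set of readable conditions checked against a candidate's profile. The sketch below is illustrative only: the thresholds (3 job changes, 2 sectors), the field names and the `explain` helper are assumptions, whereas in practice the conditions would be produced by the explanatory model itself.

```python
# Hypothetical explanatory rules: label -> condition on a profile dictionary.
RULES = {
    "has changed jobs often enough": lambda p: p["job_changes"] >= 3,
    "has worked in multiple sectors": lambda p: p["sectors"] >= 2,
    "latest experience in a medium-sized company":
        lambda p: p["last_company_size"] == "medium",
}

def explain(profile):
    """Return, for each rule, whether the profile satisfies it."""
    return {label: rule(profile) for label, rule in RULES.items()}

# Hypothetical candidate profile.
candidate = {"job_changes": 4, "sectors": 3, "last_company_size": "large"}

report = explain(candidate)
suited = all(report.values())

for label, satisfied in report.items():
    print(("OK   " if satisfied else "MISS ") + label)
print("All conditions met:", suited)
```

Unlike a bare score, this kind of output tells the recruiter which conditions a candidate meets and which one is missing, and so leaves room for human judgement on the exceptions.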

What are the risks and opportunities for recruiters?

Of course, if we continue to “apply the model” indefinitely, even these atypical profiles will become the norm: we will make clones (again!). So yes, we must let innovation, creativity and risk-taking generate new situations, which will in turn generate failures and successes, as well as new opportunities. This cycle is already familiar in other analytics applications (in marketing, in manufacturing, etc.).

Analytics and Big Data in employment access, as in all other disciplines, must be introduced with great rigour, professionalism and impartiality; there is no place for sorcerer’s apprentices. HR departments, agencies and administrations must call on the best professionals and collaborate very closely with them to create relevant approaches.

Benoît Binachon – Uman Partners – Executive Search For Data Driven Functions

See the interview “Why Analytics and Big Data are an effective way of fighting against discrimination in hiring” in RHAdvisor.

*Big data encompasses two strongly linked themes:

– the management of a very large quantity of available data (quantitative, but also textual)

– analysis using sophisticated or less sophisticated algorithms (from statistics and models — which describe laws — to Artificial intelligence — which discovers facts without necessarily having researched them and without necessarily explaining them).