
GenAI in HR Analytics – Strengths, Limitations, Workarounds — Part II


The world is abuzz with terms like Generative AI (GenAI), Large Language Models (LLMs), and ChatGPT. There's a bit of a mysterious aura surrounding these areas.

This is one of a series of blog posts that attempt to clear the haze around these areas from a practical standpoint. You might want to read Part-I of this same title first, to get a fuller context for what we'll talk about in this particular post.

Business Context — Voice of Employees (VoE) — continued from Part-I

In Part-I, we saw three practical challenges faced while using LLMs. Here are a couple more, related to the same context.

Challenge 4:  Unaware of local context

Your team is building a chatbot for Employee Queries. An Employee can ask a question about HR policies, and get answers from the chatbot. That’s the expectation.

Here's the catch, though: ChatGPT, sitting out there, has no clue about your organization's HR policies. To be fair, it has never been trained on them. So it may get lucky with the answers occasionally, but there is no way it can provide reliable answers in a reasonably consistent manner.

Here’s a screenshot depicting one such interaction between an Employee and ChatGPT. The company name has been greyed out, for obvious reasons. 

So, what’s the solution? 

There are three options to consider.

Option 1:  Simple prompt engineering

We could consider taking the easy way out, and simply bundle the context (here, HR policies) in the prompt/question provided to ChatGPT.

Unfortunately, this option may not work, since there is a limit to the amount of text (measured in tokens) you can send to ChatGPT with each question. Bundling large HR policy documents into every prompt is therefore not feasible, as the sketch below illustrates.
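To make that concrete, here is a minimal sketch of the naive approach. The `ask_chatgpt` helper is a hypothetical placeholder for whatever LLM API you use, and the token budget shown is purely illustrative, since actual context-window limits vary by model.

```python
# Option 1 (sketch): stuff the entire HR policy text into the prompt.
# 'ask_chatgpt' is a hypothetical placeholder for your LLM API call,
# and the 4,000-token budget below is purely illustrative.

def ask_chatgpt(prompt: str) -> str:
    raise NotImplementedError("placeholder: call your LLM provider here")

def naive_policy_answer(question: str, policy_text: str) -> str:
    prompt = (
        "Answer the employee's question using only the HR policy below.\n\n"
        f"HR POLICY:\n{policy_text}\n\n"
        f"QUESTION: {question}"
    )
    # Rough token estimate (one token is roughly 0.75 words for English text).
    approx_tokens = int(len(prompt.split()) / 0.75)
    if approx_tokens > 4000:  # illustrative context-window budget
        raise ValueError(
            f"Prompt is ~{approx_tokens} tokens; a full policy manual will not fit."
        )
    return ask_chatgpt(prompt)
```

For a handful of short policy pages this works; for a full policy manual, the prompt blows past the model's context window, which is exactly the limitation described above.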

Option 2:  LLM Fine-tuning

Large Language Models (LLMs) like GPT, which underpin applications like ChatGPT, are trained on a very large corpus of documents. You can take those models and train them further on your company's HR policy documents; in LLM parlance, this is called 'fine-tuning'. That makes the model more aware of your company-specific HR policies, in turn making its answers more relevant.

There is, however, a time and cost associated with this exercise, and it needs to be repeated every now and then to take in new policies, changes to existing policies and so on. Moreover, it has been seen in practice that such fine-tuned models, while better than the 'no fine-tuning' option, still fall short in terms of relevance of answers. One reason is that the amount of text in your company's HR policy documents is minuscule compared to the humongous amount of data the models were originally trained on, which limits the 'influence' of these specific policy documents on the overall model results.
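If you do go down this route, much of the effort is in preparing question-answer pairs from your policy documents. Here is a minimal sketch of that data-preparation step; the JSONL layout shown (one chat-style example per line) is a common convention, but both the schema and the policy answers are illustrative and should be checked against your LLM provider's fine-tuning documentation.

```python
import json

# Sketch: turn curated HR policy Q&A pairs into a JSONL fine-tuning file.
# Both the policy answers and the schema below are illustrative only.
qa_pairs = [
    ("How many days of parental leave do we get?",
     "Refer to Policy 4.2: eligible employees receive 26 weeks of parental leave."),
    ("Can I carry forward unused vacation days?",
     "Refer to Policy 2.7: up to 10 unused vacation days carry into the next year."),
]

with open("hr_policy_finetune.jsonl", "w", encoding="utf-8") as f:
    for question, answer in qa_pairs:
        example = {
            "messages": [
                {"role": "system", "content": "You answer questions about company HR policy."},
                {"role": "user", "content": question},
                {"role": "assistant", "content": answer},
            ]
        }
        f.write(json.dumps(example) + "\n")
```

Even with hundreds of such pairs, the resulting signal is small relative to the model's original training data, which is why fine-tuned answers often remain only marginally better.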

Option 3: Take the middle path

An underlying engine running inside your organization can filter the context relevant to the question and send that context, along with the question, to ChatGPT.

That way, you are combining the local Context Intelligence within your company, with the global Language Intelligence of the LLM. 

Your results will look much better now. Building this automated engine is the tricky yet effective part of your overall solution; a minimal sketch follows.
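Here is that sketch, assuming placeholder `embed` and `ask_llm` hooks; they stand in for whichever embedding model and LLM API your organization uses, and are not a specific vendor's interface. The idea: split the policy documents into chunks, retrieve the chunks most similar to the question, and send only those along with the question.

```python
import numpy as np

# Option 3 (sketch): retrieve only the relevant policy chunks and send
# them to the LLM with the question. 'embed' and 'ask_llm' are placeholder
# hooks, not a specific vendor API.

def embed(text: str) -> np.ndarray:
    raise NotImplementedError("placeholder: return an embedding vector for the text")

def ask_llm(prompt: str) -> str:
    raise NotImplementedError("placeholder: call your LLM of choice")

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def answer_with_context(question: str, policy_chunks: list[str], top_k: int = 3) -> str:
    q_vec = embed(question)
    # Rank policy chunks by similarity to the question and keep the best few.
    ranked = sorted(policy_chunks, key=lambda c: cosine(q_vec, embed(c)), reverse=True)
    context = "\n\n".join(ranked[:top_k])
    prompt = (
        "Answer the employee's question using only the policy excerpts below. "
        "If the excerpts do not cover it, say so.\n\n"
        f"POLICY EXCERPTS:\n{context}\n\nQUESTION: {question}"
    )
    return ask_llm(prompt)
```

In practice you would pre-compute and index the chunk embeddings rather than embedding every chunk for every question, and add caching, access control and evaluation. The division of labour, though, stays the same: your engine supplies the local Context Intelligence, the LLM supplies the global Language Intelligence.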

=================================================

Challenge 5:  Insights beyond Language models

While discussing the four challenges so far, in this post as well as the previous one, I hope we have gained some intuitive understanding of the strengths and limitations of such language models. In this section, we will see how these language models can work with other engines to build powerful symbiotic ensembles.

Sticking with the context of HR Analytics, let's say you want to understand the factors that influence attrition (or, positively speaking, retention) in your organization, how much each of these factors influences retention, and how to predict retention/attrition.

You know intuitively that there are several predictors of retention/attrition: learning opportunities, organizational culture, the manager, how challenging/interesting the work is, current salary, the difference between current salary and the market median, and so on.

Many of the above factors (e.g. salary) are available as structured data. Traditional Machine Learning/AI engines are reasonably good at crunching such structured data to identify patterns, correlations and so on, and at using that information to predict the likelihood of attrition.

At the same time, there are some critical factors, like learning opportunities and work-life balance, which are subjective opinions/perspectives from the employee. These are present as unstructured data (that is, free text) that traditional ML/AI engines cannot easily consume. This is where GenAI engines like LLMs come into play: they can convert this unstructured data into structured data. For example, you can use the LLM to create a sentiment score per review from an employee, and aggregate those into an overall sentiment score for that employee. This can then be added to the rest of the structured data (like salary) and fed into the ML/AI engine, as sketched below. You are effectively capturing the latent information in such subjective data and feeding it into the prediction process.
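As a minimal sketch of this hand-off, assume a placeholder `llm_sentiment` helper that asks your LLM to score a free-text comment; the column names for the structured HR data are likewise illustrative, not a prescribed schema.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Sketch: convert free-text employee comments into a numeric sentiment
# feature via an LLM, then combine it with structured HR data to train
# an attrition model. 'llm_sentiment' and the column names are illustrative.

def llm_sentiment(comment: str) -> float:
    """Placeholder: ask the LLM to rate the comment from -1 (negative) to +1 (positive)."""
    raise NotImplementedError

def train_attrition_model(df: pd.DataFrame) -> RandomForestClassifier:
    df = df.copy()
    # Unstructured -> structured: one sentiment score per employee comment.
    df["sentiment"] = df["survey_comment"].apply(llm_sentiment)

    features = df[["salary", "salary_vs_market_median", "tenure_years", "sentiment"]]
    target = df["left_company"]  # 1 = attrited, 0 = retained

    model = RandomForestClassifier(n_estimators=200, random_state=0)
    model.fit(features, target)
    return model
```

Feature importances from such a model then give a first-cut answer to "how much does each factor influence attrition", with the subjective, text-based signals now sitting alongside the structured ones.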

To summarize, the Generative AI LLMs can be used not just on their own, but also in combination with other engines that can mutually strengthen each other’s capabilities, thus providing you with a better toolkit for informed decision making.

Jayaprakash Nair

