In the context of Large Language Models (LLMs), "temperature" is a hyperparameter that controls the randomness of the model's output during text generation. It shapes the probability distribution over possible next tokens (words or subword units) given an input. An LLM generates text by predicting the next token in a sequence based on the previous tokens, assigning a probability to each candidate token. So where does temperature come in?

The temperature parameter rescales these probabilities before a token is sampled. With a temperature below 1, the distribution is sharpened: tokens with higher probabilities become even more likely, while those with lower probabilities become even less likely, making the output more deterministic and conservative.

With a temperature above 1, the distribution is "flattened": less likely tokens gain probability, adding diversity and randomness to the output. This can make the text more creative but also more prone to errors or incoherence. Finally, at a temperature of exactly 1, the model samples tokens directly from the predicted probabilities without any adjustment.
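Concretely, temperature is usually applied by dividing the model's raw scores (logits) by the temperature value before the softmax turns them into probabilities. Here is a minimal sketch in Python; the logits are invented for illustration:

```python
import numpy as np

def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw model scores (logits) into next-token probabilities,
    sharpened (T < 1) or flattened (T > 1) by the temperature."""
    scaled = np.asarray(logits, dtype=float) / temperature
    exp = np.exp(scaled - scaled.max())  # subtract max for numerical stability
    return exp / exp.sum()

logits = [2.0, 1.0, 0.5]  # toy scores for three candidate tokens

for t in (0.5, 1.0, 1.5):
    print(f"T={t}: {softmax_with_temperature(logits, t).round(3)}")
```

Running this shows the top token's probability rising from roughly 0.63 at T=1.0 to about 0.84 at T=0.5, and falling to about 0.53 at T=1.5: the same sharpening and flattening described above.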

To illustrate how temperature influences text generation, consider a model generating the next word after "The sun is shining and the sky is" (a runnable sampling sketch follows the list):

  • Temperature = 0.5: The model might predict "blue" with high confidence, producing "The sun is shining and the sky is blue."
  • Temperature = 1.0: The model might choose among several common options like "blue," "clear," or "bright," leading to outputs like "The sun is shining and the sky is clear."
  • Temperature = 1.5: The model might include less common options such as "orange," "magical," or "dramatic," resulting in more varied completions like "The sun is shining and the sky is magical."
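The sketch below mimics this example by repeatedly sampling a next word from a toy distribution at each temperature; the candidate words and their logits are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the demo is repeatable

# Hypothetical logits a model might assign to candidate next words
# after "The sun is shining and the sky is".
vocab = ["blue", "clear", "bright", "orange", "magical", "dramatic"]
logits = np.array([3.0, 2.2, 2.0, 0.5, 0.2, 0.1])

def sample_next_word(temperature):
    scaled = logits / temperature
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()
    return str(rng.choice(vocab, p=probs))

for t in (0.5, 1.0, 1.5):
    samples = [sample_next_word(t) for _ in range(8)]
    print(f"temperature={t}: {samples}")
```

At 0.5 the samples are dominated by "blue"; at 1.5 rarer words like "orange" or "magical" start to appear, just as in the completions above.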

Leveraging LLM Creativity for Enhanced Troubleshooting in Industrial Applications

In the example above, we aimed for the LLM to generate more creative responses. But why would this be desirable in an industrial context? Surprisingly, this characteristic can be very beneficial. During our recent GenAI deployment (see the link for more information), we used repair manuals to identify the correct troubleshooting actions for various asset problems.

The challenge was that not every asset type had up-to-date sections for every issue; some solutions were described only in manuals for other machines. Initially, we set the temperature low to ensure the model returned the exact documented remedy. However, this meant that some repair actions for specific asset types were missed, even though the proper instructions were available in other manuals.

By increasing the temperature above 1, the LLM began to surface relevant remedy instructions from closely related documents, much as a human would cross-reference materials. This allowed the model to provide accurate repair actions even when the exact instructions were missing from the asset's own manual, a genuinely remarkable capability.
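In practice, this comes down to raising the temperature parameter in the generation request. The snippet below is only a generic sketch using an OpenAI-style chat completion call, not our deployment code; the model name, prompts, and the value 1.2 are all illustrative:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# A temperature above 1 encourages the model to draw on related wording
# from nearby manual excerpts rather than only the most literal match.
response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    temperature=1.2,      # above 1: broader, more exploratory output
    messages=[
        {"role": "system", "content": "You are a maintenance assistant. "
         "Use the supplied manual excerpts to propose repair actions."},
        {"role": "user", "content": "Asset: pump P-101 (hypothetical). "
         "Fault: bearing overheating. Manual excerpts: ..."},
    ],
)
print(response.choices[0].message.content)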

If you're eager to explore further, consider watching this video for additional insights. Better yet, if you're keen on implementing these strategies within your own organization, don't hesitate to reach out to us.