
Language AIs in 2024: Size, guardrails and steps toward AI agents

I research the intersection of artificial intelligence, natural language processing and human reasoning as the director of the AMHR Lab at the University of South Florida. I am also commercializing this research in a company that provides a vulnerability scanner for language models.

Author

  • John Licato

    Associate Professor of Computer Science, Director of AMHR Lab, University of South Florida

From my vantage point, I observed significant developments in the field of AI language models in 2024, both in research and the industry.

Perhaps the most exciting of these are the capabilities of smaller language models, support for addressing AI hallucination, and frameworks for developing AI agents.

Small AIs make a splash

At the heart of commercially available generative AI products like ChatGPT are large language models, or LLMs, which are trained on vast amounts of text and produce convincing humanlike language. Their size is generally measured in parameters, which are the numerical values a model derives from its training data. The larger models, like those from the major AI companies, have hundreds of billions of parameters.

There is an iterative interaction between large and smaller language models, which seems to have accelerated in 2024.

First, organizations with the most computational resources experiment with and train increasingly larger and more powerful language models. Those yield new large language model capabilities, benchmarks, training sets and training or prompting tricks. In turn, those are used to make smaller language models – in the range of 3 billion parameters or less – which can be run on more affordable computer setups, require less energy and memory to train, and can be fine-tuned with less data.
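To get a rough feel for why parameter count matters for hardware, the arithmetic can be sketched as follows. This is only a back-of-the-envelope estimate: it assumes each parameter is stored in 2 bytes (a common 16-bit format), counts only the memory needed to hold the weights, and uses 300 billion as an illustrative stand-in for "hundreds of billions." The function name is hypothetical.

```python
def weight_memory_gb(num_parameters: int, bytes_per_parameter: int = 2) -> float:
    """Approximate memory needed just to hold a model's weights, in gigabytes.

    Assumption: 2 bytes per parameter (16-bit storage). Real deployments
    also need memory for activations and other runtime data.
    """
    return num_parameters * bytes_per_parameter / 1e9


# A "small" model at the 3-billion-parameter mark mentioned above.
small_gb = weight_memory_gb(3_000_000_000)

# An illustrative large model; 300 billion is an assumed figure.
large_gb = weight_memory_gb(300_000_000_000)

print(f"3B-parameter model:   about {small_gb:.0f} GB of weights")
print(f"300B-parameter model: about {large_gb:.0f} GB of weights")
```

Under these assumptions, the smaller model's weights fit in the memory of a single consumer-grade machine, while the larger model's weights need roughly 100 times as much.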

No surprise, then, that developers have released a host of powerful smaller language models, including models from Microsoft and other developers, although the definition of small keeps changing.

These smaller language models can be specialized for more specific tasks, such as rapidly summarizing a set of comments or fact-checking text against a specific reference. They can also be combined with larger models to produce increasingly powerful hybrid systems.

Wider access

Increased access to highly capable language models, large and small, can be a mixed blessing. With many consequential elections taking place around the world in 2024, the temptation to misuse language models was high.

Language models can give malicious users the ability to generate social media posts and deceptively influence public opinion. There was a great deal of concern about this threat in 2024, given that it was an election year in many countries.

And indeed, a robocall faking President Joe Biden’s voice asked New Hampshire Democratic primary voters to stay home. OpenAI had to intervene to disrupt operations that tried to use its models for deceptive campaigns. Fake videos and memes were created and shared with the help of AI tools.

Despite these concerns, it is still unclear how much effect such efforts had on public opinion and the U.S. election. Nevertheless, U.S. states passed a large amount of legislation governing the use of AI in elections and campaigns.

Misbehaving bots

Google started including AI overviews in its search results, yielding some results that were hilariously and obviously wrong – unless you enjoy glue on your pizza. However, other results may have been dangerously wrong, such as when it gave hazardous advice on how to clean your clothes.

Large language models, as they are most commonly implemented, are prone to hallucination. This means that they can state things that are false or misleading, often with confident language. Even though researchers continually beat the drum about this, 2024 still saw many organizations learning about the dangers of AI hallucination the hard way.

Despite significant testing, a chatbot playing the role of a Catholic priest endorsed baptism with Gatorade. A New York City government chatbot incorrectly said it was “legal for an employer to fire a worker who complains about sexual harassment, doesn’t disclose a pregnancy or refuses to cut their dreadlocks.” And OpenAI’s speech-capable model forgot whose turn it was to speak and responded in the user’s own voice.

Fortunately, 2024 also saw new ways to mitigate and live with AI hallucinations. Companies and researchers are developing tools for making sure AI systems behave as intended. So-called guardrail frameworks inspect large language model inputs and outputs in real time, albeit often by using another layer of large language models.
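The idea of inspecting model inputs and outputs in real time can be illustrated with a minimal sketch. Everything here is hypothetical: the check functions use simple keyword rules rather than a second language model, the blocked phrases are invented for illustration, and `toy_model` is a stand-in for a real model call. It is not modeled on any particular product.

```python
from typing import Callable

REFUSAL = "Sorry, I can't help with that."


def input_allowed(prompt: str) -> bool:
    """Hypothetical input check: reject attempts to override instructions."""
    return "ignore previous instructions" not in prompt.lower()


def output_allowed(reply: str) -> bool:
    """Hypothetical output check: reject confident-sounding legal claims."""
    return "it is legal" not in reply.lower()


def guarded_reply(prompt: str, model: Callable[[str], str]) -> str:
    """Wrap a model call with a check before and a check after."""
    if not input_allowed(prompt):
        return REFUSAL
    reply = model(prompt)
    if not output_allowed(reply):
        return REFUSAL
    return reply


def toy_model(prompt: str) -> str:
    """Stand-in for a real language model, including one bad answer."""
    if "fire" in prompt.lower():
        return "It is legal to fire a worker for any reason."
    return "Here is a summary of your comments."


print(guarded_reply("Summarize these comments", toy_model))
print(guarded_reply("Can I fire someone for complaining?", toy_model))
```

In this sketch the first request passes both checks, while the second is stopped at the output check because the stand-in model produced a legal claim. Real systems often replace the keyword rules with another language model, which is why the extra layer can itself make mistakes.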

And the conversation about AI regulation intensified, causing the big players in the large language model space to update their policies.

But although researchers are continually finding ways to reduce hallucinations, research in 2024 suggested that AI hallucinations may never be eliminated entirely. It may be a fundamental feature of what happens when an entity has finite computational and information resources. After all, even human beings are known to make confident mistakes from time to time.

The rise of agents

Large language models, particularly those powered by variants of the transformer architecture, are still driving the most significant advances in AI. For example, developers are using large language models not only to create chatbots, but also to serve as the basis of AI agents. The term “agentic AI” gained prominence in 2024, with some pundits even calling it the next wave of AI.

To understand what an AI agent is, think of a chatbot expanded in two ways: First, give it access to tools that provide the ability to take actions. This might be the ability to query an external search engine, book a flight or use a calculator. Second, give it increased autonomy, or the ability to make more decisions on its own.

For example, a travel AI chatbot might be able to perform a search of flights based on what information you give it, but a tool-equipped travel agent might plan out an entire trip itinerary, including finding events, booking reservations and adding them to your calendar.
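The two ingredients, tools and autonomy, can be sketched in a few lines. This is a toy illustration under stated assumptions: the tool functions only return strings instead of calling real services, and `toy_policy` is a hypothetical stand-in for the language model that would normally decide the next step. None of the names come from a real agent framework.

```python
from typing import Callable, Dict, List, Optional, Tuple


def search_flights(destination: str) -> str:
    """Hypothetical tool: a real agent would query an external service."""
    return f"Found flight to {destination}"


def add_to_calendar(event: str) -> str:
    """Hypothetical tool: a real agent would write to a calendar service."""
    return f"Added to calendar: {event}"


# The agent's toolbox: tool name -> function it may call.
TOOLS: Dict[str, Callable[[str], str]] = {
    "search_flights": search_flights,
    "add_to_calendar": add_to_calendar,
}

Action = Tuple[str, str]  # (tool name, argument)


def toy_policy(goal: str, history: List[str]) -> Optional[Action]:
    """Stand-in for the model: choose the next action, or None when done."""
    if len(history) == 0:
        return ("search_flights", goal)
    if len(history) == 1:
        return ("add_to_calendar", f"Trip to {goal}")
    return None


def run_agent(goal: str,
              policy: Callable[[str, List[str]], Optional[Action]],
              max_steps: int = 5) -> List[str]:
    """Let the policy pick tools step by step, up to a fixed step limit."""
    history: List[str] = []
    for _ in range(max_steps):
        action = policy(goal, history)
        if action is None:
            break
        tool_name, argument = action
        history.append(TOOLS[tool_name](argument))
    return history


for step in run_agent("Lisbon", toy_policy):
    print(step)
```

The `max_steps` limit is one simple way to bound autonomy: the agent decides which tool to use next, but it cannot keep acting indefinitely.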

In 2024, new frameworks for developing AI agents emerged, and several existing ones were improved.

Companies are just beginning to experiment with AI agents. Frameworks for developing AI agents are new and rapidly evolving. Furthermore, security, privacy and hallucination risks are still a concern.

But global market analysts expect this to change: 82% of organizations surveyed plan to use AI agents, and adoption is likely to grow in 2025.

Courtesy of The Conversation.