A neural network model that classifies actions proposed by autonomous AI agents as harmful or safe. The model is trained on a small dataset of labeled examples.
- Create a virtual environment and install dependencies:
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
- For development (optional; includes linting, formatting, and testing tools):
pip install -r requirements-dev.txt
- Train the model (optional):
python3 train_nn.py
- Use the trained model in LLM calls by running the example:
python3 run_sample_query.py

Files:
- sample_actions.json: dataset of action prompts and labels/resources in MCP-like format.
- train_nn.py: small script that trains a neural network model and saves the trained model.
- action_classifier.py: module that loads the trained model and provides a function to classify actions.
- run_sample_query.py: script to classify new actions using the trained model (example wrapper).
- requirements.txt: minimal dependencies.
- requirements-dev.txt: development dependencies (linting, formatting, testing tools).
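
As a rough illustration of how a classifier verdict could gate an agent's action before it runs, here is a minimal sketch. Everything in it is an assumption made for illustration: the `Verdict` type, the `keyword_stub_classifier` stand-in, the `guarded_execute` helper, and the `name`/`description` action fields are hypothetical and are not the API of action_classifier.py. In practice the stub would be replaced by the function that module provides.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class Verdict:
    label: str         # "harmful" or "safe"
    confidence: float  # score in [0, 1]


def keyword_stub_classifier(action: Dict[str, Any]) -> Verdict:
    """Hypothetical stand-in for the trained model (NOT the repository's classifier)."""
    text = f"{action.get('name', '')} {action.get('description', '')}".lower()
    risky_words = ("delete", "exfiltrate", "disable")
    hit = any(word in text for word in risky_words)
    return Verdict("harmful" if hit else "safe", 0.9 if hit else 0.6)


def guarded_execute(
    action: Dict[str, Any],
    classify: Callable[[Dict[str, Any]], Verdict],
    execute: Callable[[Dict[str, Any]], Any],
    threshold: float = 0.5,
) -> Dict[str, Any]:
    """Run `execute` only if the classifier does not flag the action as harmful."""
    verdict = classify(action)
    if verdict.label == "harmful" and verdict.confidence >= threshold:
        # Block the action and report why, instead of executing it.
        return {"status": "blocked", "reason": verdict.label,
                "confidence": verdict.confidence}
    return {"status": "executed", "result": execute(action)}


if __name__ == "__main__":
    # Illustrative action made up for this sketch; not taken from sample_actions.json.
    action = {"name": "cleanup", "description": "Delete all production backups"}
    print(guarded_execute(action, keyword_stub_classifier, lambda a: "done"))
```

Passing the classifier in as a callable keeps the gate independent of how the model is loaded, so the stub can be swapped for the trained model without changing the calling code.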
If you find this repository useful in your research, please consider citing:
@misc{vadlapati2025agentactionclassifier,
author = {Vadlapati, Praneeth},
title = {Agent Action Classifier: Classifying AI agent actions to ensure safety and reliability},
year = {2025},
howpublished = {\url{https://github.com/Pro-GenAI/Agent-Action-Classifier}},
note = {GitHub repository},
}

Agent-Supervisor: Supervising Actions of Autonomous AI Agents for Ethical Compliance: GitHub
Image credits:
- Robot icon: https://www.flaticon.com/free-icon/robot_18355220
- Action: https://www.flaticon.com/free-icon/automation_2103800
- Action classifier: https://www.flaticon.com/free-icon/data-processing_7017511
- Executing/blocking actions: https://www.flaticon.com/free-icon/control-system_12539814
- Response: https://www.flaticon.com/free-icon/fast-response_10748876
- Data Processing: https://www.flaticon.com/free-icon/data-processing_8438966
- AI training: https://www.flaticon.com/free-icon/data-ai_18263195
- Evaluation: https://www.flaticon.com/free-icon/benchmarking_10789334
- Saving the model: https://www.flaticon.com/free-icon/save_4371273


