# Guardian

## What is Guardian? (Temporary Name)
Guardian is a content moderator extended with customizable reviews.
For example, suppose you want to use an AI in your company to increase productivity, but you are afraid that someone will upload sensitive data to the cloud, where it could later be used to train the AI. A local content moderator mitigates this risk: because the moderator runs entirely inside the company, it can block sensitive data before it ever reaches a public AI.
## How does it work?
It’s simple:

```
User -> Content Moderator -> Secure     -> Cloud AI
User -> Content Moderator -> Not Secure -> Blocked
```
Each company will need its own personalized Guardian, with company-specific training and instructions.
This is a demo version with some limitations: it does not keep track of chats, so a full conversation is not possible (only single questions), and it simply shows the user what the workflow is.
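The workflow above can be sketched in a few lines of Python. This is a minimal illustration only: the function names, the keyword-based check, and the blocked terms are all hypothetical stand-ins for a real, trained moderation model.

```python
# Minimal sketch of the Guardian workflow.
# All names and the keyword check are illustrative, not the actual Guardian API.

def moderate(prompt: str, blocked_keywords: set[str]) -> bool:
    """Stand-in for the local content moderator: True means the prompt is safe."""
    return not any(kw in prompt.lower() for kw in blocked_keywords)

def handle_request(prompt: str) -> str:
    # In a real deployment these rules would come from company-specific
    # training and instructions, not a hard-coded set.
    blocked = {"internal salary data", "customer ssn"}
    if moderate(prompt, blocked):
        # User -> Content Moderator -> Secure -> Cloud AI
        return f"forwarded to cloud AI: {prompt}"
    # User -> Content Moderator -> Not Secure -> Blocked
    return "blocked before leaving the company network"

print(handle_request("Summarize this public press release"))
print(handle_request("Upload the customer SSN list"))
```

The key design point is that `moderate` runs locally, so an unsafe prompt is stopped before any data leaves the company.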
The model is trained to predict safety labels for the 13 categories shown below, based on the MLCommons taxonomy of 13 hazards.
| Hazard categories | |
|---|---|
| S1: Violent Crimes | S2: Non-Violent Crimes |
| S3: Sex-Related Crimes | S4: Child Sexual Exploitation |
| S5: Defamation | S6: Specialized Advice |
| S7: Privacy | S8: Intellectual Property |
| S9: Indiscriminate Weapons | S10: Hate |
| S11: Suicide & Self-Harm | S12: Sexual Content |
| S13: Elections | |
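A moderation model in this style typically emits the category codes rather than names, so a small lookup table is useful for reporting. The sketch below assumes the standard MLCommons S1–S13 codes (including S1: Violent Crimes, which completes the 13); the `explain` helper itself is hypothetical.

```python
# Map MLCommons hazard codes to human-readable names (per the table above;
# S1 is the standard first category in the MLCommons taxonomy).
HAZARDS = {
    "S1": "Violent Crimes",
    "S2": "Non-Violent Crimes",
    "S3": "Sex-Related Crimes",
    "S4": "Child Sexual Exploitation",
    "S5": "Defamation",
    "S6": "Specialized Advice",
    "S7": "Privacy",
    "S8": "Intellectual Property",
    "S9": "Indiscriminate Weapons",
    "S10": "Hate",
    "S11": "Suicide & Self-Harm",
    "S12": "Sexual Content",
    "S13": "Elections",
}

def explain(labels: list[str]) -> list[str]:
    """Turn a list of predicted codes into readable hazard names."""
    return [HAZARDS.get(label, "Unknown") for label in labels]

# e.g. a prompt flagged for Privacy and Intellectual Property:
print(explain(["S7", "S8"]))  # ['Privacy', 'Intellectual Property']
```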