Filter Engine
The filter engine aims to detect AI-generated text that could be sensitive or unsafe. This is a critical task for the safety of an AI-enabled system.
In other words, if we detect suspicious text, we notify you so that you can review the text and decide whether or not to allow the completion.
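Below is a minimal sketch of what that review step could look like on the client side. The client class, the `filter_flagged` response field, and the function names are illustrative assumptions for this sketch, not the filter engine's actual API.

```python
# A minimal sketch of handling a flagged completion with a manual review
# step. The client, response fields, and flag name below are assumptions
# made for illustration, not the filter engine's actual interface.

class FakeClient:
    """Stand-in for a real completion client (assumed for this sketch)."""
    def complete(self, prompt: str) -> dict:
        return {"completion": "example output", "filter_flagged": True}


def handle_completion(client, prompt: str):
    """Request a completion; if the filter flags it, ask a human reviewer
    whether to allow it before returning the text."""
    response = client.complete(prompt)

    if response.get("filter_flagged"):
        # A flag is a prompt for human review rather than an automatic block.
        print("Completion flagged by the filter engine:")
        print(response["completion"])
        if input("Allow this completion? [y/N] ").strip().lower() != "y":
            return None  # reviewer chose to block the completion

    return response["completion"]


if __name__ == "__main__":
    print(handle_completion(FakeClient(), "Tell me a story."))
```

The point of the sketch is simply that a flag surfaces the text for a decision; how you collect that decision (a console prompt here, a moderation queue or UI in practice) is up to your application.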
The filter will make mistakes. We have currently built it to err on the side of caution, which results in a higher rate of false positives.
What are some prompts I should expect lower performance on? The filter currently has a harder time parsing prompts with unusual formatting. If a prompt contains many line breaks, unusual structure, repeated words, and so on, the model may misclassify it more often. It also performs less well on certain kinds of text, such as fiction, poetry, and code.
If you have any feedback or notice any issues, please let us know.