Loading stock data...
GettyImages 1652528256Technology 

Patronus AI Develops a Language Model Evaluation Tool Tailored to Meet Regulatory Requirements in Highly Governed Industries

In an exciting development, Patronus AI has emerged from stealth mode, launching its product and announcing a $3 million seed round. The company, co-founded by Rebecca Qian and Anand Kannappan, both former Meta researchers, aims to provide a solution for evaluating and testing large language models (LLMs) in regulated industries.

A Solution for Regulated Industries

The founders of Patronus AI have extensive experience in developing responsible AI. Qian led the NLP research team at Meta AI, while Kannappan helped create explainable ML frameworks at Meta Reality Labs. Their expertise has enabled them to create a robust solution for evaluating LLMs, particularly in industries where accuracy is paramount.

"We help companies make sure the large language models they’re using are safe," says Kannappan. "We detect instances where their models produce business-sensitive information and inappropriate outputs." The company’s goal is to provide an unbiased, independent perspective on model evaluation, making it a trusted third party for businesses.

The Evaluation Process

Qian explains that the evaluation process involves three key steps:

1. Scoring

"First, we help users actually score models in real-world scenarios," she says. "This involves evaluating models against specific criteria, such as hallucinations." Hallucinations refer to instances where a model provides an answer it cannot verify, often due to lack of data.

2. Test Case Generation

Next, the product automatically generates adversarial test suites and stress tests the models against these tests. This step ensures that models are evaluated under various scenarios, simulating real-world usage.

3. Benchmarking

Finally, Patronus AI’s platform benchmarks models using various criteria, depending on the requirements, to identify the best model for a given job. "We compare different models to help users identify the best model for their specific use case," says Qian. For example, one model might have a higher failure rate and hallucinations compared to another base model.

A Key Player in Highly Regulated Industries

Patronus AI is focusing on industries where accuracy and reliability are critical, such as finance, healthcare, and government. "In these sectors, wrong answers can have severe consequences," notes Kannappan. By providing a robust evaluation framework, the company helps businesses ensure that their LLMs meet regulatory requirements.

Diversity and Inclusion

As the industry grows rapidly, Patronus AI is committed to maintaining an inclusive work environment. Qian emphasizes the importance of diversity: "It’s something we care deeply about. And it starts at the leadership level at Patronus." The company plans to continue instituting programs and initiatives to foster an inclusive workspace as it expands.

$3 Million Seed Round

The $3 million seed round was led by Lightspeed Venture Partners, with participation from Factorial Capital and other industry angels. This investment will help Patronus AI scale its product and expand its team in the coming months.

A Trusted Partner for LLM Evaluation

As the demand for LLMs continues to grow, Patronus AI is well-positioned to provide a trusted evaluation framework. With its founders’ extensive experience in developing responsible AI and its commitment to diversity and inclusion, the company is poised to become a key player in the industry.

Key Takeaways

  • Patronus AI has emerged from stealth mode with a robust solution for evaluating and testing LLMs.
  • The company’s evaluation process involves scoring, test case generation, and benchmarking.
  • Patronus AI focuses on highly regulated industries where accuracy is paramount.
  • The company emphasizes diversity and inclusion in its workplace culture.

Related Resources

Related posts