Deep Cogito emerges from stealth with hybrid AI ‘reasoning’ models

Deep Cogito, a new company, has emerged from stealth with a family of openly available AI models that can switch between “reasoning” and non-reasoning modes.

Reasoning models, such as OpenAI’s o1, have shown significant promise in fields like mathematics and physics because they can effectively fact-check themselves by working through complex problems step by step. However, this capability comes at a cost: reasoning models require more computing resources and respond with higher latency. To address this, companies like Anthropic are exploring “hybrid” model architectures that combine reasoning components with standard, non-reasoning elements, allowing for quick responses to simple queries while taking more time on challenging questions.

All of Deep Cogito’s models, collectively known as Cogito 1, are hybrid models. The company claims that these models outperform similar-sized open models from Meta and Chinese AI startup DeepSeek.

According to a blog post by the company, each Cogito 1 model can either answer directly or self-reflect before responding, as reasoning models do. The models were developed by a small team in approximately 75 days.

The Cogito 1 models range from 3 billion to 70 billion parameters, and the company plans to introduce models with up to 671 billion parameters in the near future. Parameter count roughly corresponds to a model’s problem-solving ability, with more parameters generally leading to better performance.

Cogito 1 was not built from scratch. Deep Cogito built on top of Meta’s open Llama and Alibaba’s Qwen models to create its own. The company says it applied novel training techniques to boost the base models’ performance and enable toggleable reasoning.

According to Cogito’s internal benchmarking, the largest Cogito 1 model, Cogito 70B, outperforms DeepSeek’s R1 reasoning model on certain mathematics and language evaluations when reasoning is enabled. With reasoning disabled, Cogito 70B also surpasses Meta’s Llama 4 Scout model on LiveBench, a general-purpose AI benchmark.

All Cogito 1 models are available for download or can be accessed via APIs on cloud providers Fireworks AI and Together AI.

Cogito 1’s performance compared to other popular openly available AI models. Image Credits: Deep Cogito

In its blog post, Cogito said that it is still in the early stages of scaling, having used only a fraction of the compute typically allocated for training large language models. The company is exploring additional post-training methods to improve its models further.

Public filings indicate that Deep Cogito, based in San Francisco, was established in June 2024. The company’s LinkedIn profile lists Drishan Arora and Dhruv Malhotra as co-founders. Malhotra was previously a product manager at Google’s DeepMind AI lab, where he worked on generative search technology, while Arora was a senior software engineer at Google.

Backed by South Park Commons, Deep Cogito has set its sights on building “general superintelligence,” which it describes as AI that can perform tasks better than most humans and uncover capabilities not yet imagined.
