A recent study introduces a new benchmark for assessing artificial intelligence (AI) by measuring the duration and complexity of the tasks AI systems can handle compared with humans. AI agents already excel at short, simple tasks, but their performance declines significantly on longer, more complex ones. The study found that the length of tasks AI can reliably complete has doubled approximately every seven months over the past six years, which indicates exponential growth. If this trend continues, AI may be capable of automating a month’s worth of human software development by 2032. Experts see the new metric as a valuable way to gauge AI’s real-world potential and as a sign that generalist AI agents capable of managing varied and prolonged tasks may emerge soon, which could revolutionize business operations and daily life.
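The arithmetic behind a fixed doubling time can be sketched in a few lines of Python. This is only an illustration of exponential growth with the seven-month doubling period reported above, not the study's own method. The function names and the example task lengths (1 hour and 8 hours) are hypothetical placeholders, because the summary does not state the current baseline task length.

```python
import math


def growth_factor(elapsed_months: float, doubling_months: float = 7.0) -> float:
    """Multiplicative growth after elapsed_months, given a fixed doubling time."""
    return 2.0 ** (elapsed_months / doubling_months)


def months_to_reach(target_hours: float, current_hours: float,
                    doubling_months: float = 7.0) -> float:
    """Months until the task length grows from current_hours to target_hours,
    assuming the doubling trend continues unchanged (a strong assumption)."""
    if target_hours <= 0 or current_hours <= 0:
        raise ValueError("task lengths must be positive")
    return doubling_months * math.log2(target_hours / current_hours)


if __name__ == "__main__":
    # Six years of doubling every seven months: about 10.3 doublings.
    print(f"growth over 72 months: {growth_factor(72):.0f}x")

    # Hypothetical placeholder values: going from 1-hour to 8-hour tasks
    # takes three doublings.
    print(f"months from 1 h to 8 h: {months_to_reach(8.0, 1.0):.1f}")
```

Under this assumption, six years of growth corresponds to a roughly 1,250-fold increase in task length. Any projected date for month-long tasks depends strongly on the baseline task length and on the trend continuing.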