photog.social is one of the many independent Mastodon servers you can use to participate in the fediverse.
A place for your photos and banter. "Photog first" is our motto. Please refer to the site rules before posting.


#benchmarks


#Benchmarks are meant to make AI models comparable. Companies use tests and results to showcase their models' capabilities, but the significance of those results is often unclear. Researchers: established benchmarks make #AI comparable, but they are only an indicator of real-world performance: sciencemediacenter.de/angebote

alojapan.com/1331021/japanese- Japanese-led XRISM makes first-ever direct detection of sulfur in two states #benchmarks #GraphicsCard #Japan #JapanNews #Japanese #JapaneseNews #laptop #nasa #netbook #news #notebook #processor #reports #review #reviews #test #tests #XRISMSatellite #XRISMSatelliteDetectsSulfurInTwoStates An international team of scientists has, for the first time, directly detected sulfur in both its gas and solid phases in the interstellar medium — the gas-

Okay, here it is. This is the unofficial official timeline of #AI. I'm going to tell you what to expect, and it's definitely not: this all goes away and we return to before.

Are you ready for this? Are you sure? Well, read on.

Before I continue, I'm going to lay out some AI #benchmarks that we'll use to define "how good / scary is this AI?" This is in rough order of difficulty.

#Lovelace #Test for #Emergence: "Can a system produce surprising and useful outputs that weren't explicitly programmed, via weak emergence?"

#Loebner Test: "Can a computer fool casual human judges in text conversations?" (#Modern #LLM AIs are close to this.)

#Turing Test (Original Imitation Game): "A man (or a computer) and a woman both answer text interrogations, each trying to convince the interrogator that they are the woman. Can the computer perform as well as the man?" (This was the actual original #TuringTest.)

Strengthened #Imitation Game: "A man or a #computer and a woman are both answering text interrogations. Can the computer perform as well as the woman?"

#Coffee Test: "Can a #system enter a stranger's house with no prior info and, using #perception, imitation, and #reasoning, figure out how to make a cup of coffee?"

#College #Student Test: "Can a robot enroll in college, attend classes like an actual student, learn from the instruction things it didn't know before, and graduate?"

#VoightKampff Test: "Can a machine withstand adversarial expert interrogation and still pass as #human?"

#Harnad's Total Turing Test: "Is the system indistinguishable from humans in every aspect?" (This is a #DuckTest.)

Non #Duck Test: "Even with full access to internals, can experts find no evidence that it isn't a genuine human mind?"

You know how sometimes a little hobby side-project can get a bit out of hand? An unexpected performance regression on speed.python.org that only showed up on GCC 5 (and 7) led me to set up more rigorous tracking of Python performance when using different compilers. I'm still backfilling data but I think it's pretty awesome to see how much, and how consistently, free-threaded Python performance has improved since 3.13:

github.com/Yhg1s/python-benchm

GitHub - Yhg1s/python-benchmarking-public: Curated results from personal bench_runner benchmarks
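For readers curious what this kind of tracking looks like at its smallest, here is a minimal sketch of timing one CPU-bound workload and labeling the result with the interpreter it ran on. This is my own illustration using only the standard library, not the bench_runner tool the post links to; the workload and the result keys are made up for demonstration.

```python
# Minimal sketch: time a fixed workload and tag the result with the
# interpreter version and whether it is a free-threaded build.
import sys
import sysconfig
import timeit

def workload():
    # A small CPU-bound task: sum of squares.
    return sum(i * i for i in range(10_000))

def run_benchmark(repeats=5, number=100):
    # Take the best of several repeats to reduce scheduling noise.
    times = timeit.repeat(workload, repeat=repeats, number=number)
    best = min(times) / number
    # Py_GIL_DISABLED is set on free-threaded CPython builds;
    # older builds return None here.
    free_threaded = bool(sysconfig.get_config_var("Py_GIL_DISABLED") or 0)
    return {
        "python": sys.version.split()[0],
        "free_threaded": free_threaded,
        "best_seconds_per_call": best,
    }

if __name__ == "__main__":
    print(run_benchmark())
```

Running the same script under several interpreter builds (e.g. 3.13 and 3.14, GIL and free-threaded) and collecting the printed dicts gives a crude version of the cross-build comparison described above.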

🔔 New Essay 🔔

"The Intelligent AI Coin: A Thought Experiment"

Open Access here: seanfobbe.com/posts/2025-02-21

Recent years have seen a concerning trend towards normalizing decisionmaking by Large Language Models (LLM), including in the adoption of legislation, the writing of judicial opinions and the routine administration of the rule of law. AI agents acting on behalf of human principals are supposed to lead us into a new age of productivity and convenience. The eloquence of AI-generated text and the narrative of super-human intelligence invite us to trust these systems more than we have trusted any human or algorithm ever before.

It is difficult to know whether a machine is actually intelligent because of problems with construct validity, plagiarism, reproducibility and transferability in AI benchmarks. Most people will either have to personally evaluate the usefulness of AI tools against the benchmark of their own lived experience or be forced to trust an expert.

To explain this conundrum I propose the Intelligent AI Coin Thought Experiment and discuss four objections: the restriction of agents to low-value decisions, making AI decisionmakers open source, adding a human-in-the-loop and the general limits of trust in human agents.

@histodons @politicalscience

seanfobbe.com · [Essay] The Intelligent AI Coin: A Thought Experiment

The #IntelArc B580 12GB is the card to beat, period.

I see Intel selling these cards like hotcakes. I don't think the "Buy NVIDIA" mindset exists at its price of $249 when there is no RTX at that price point.

youtube.com/watch?v=dboPZUcTAW4
youtube.com/watch?v=aV_xL88vcAQ
youtube.com/watch?v=yKMigkGU8vI
youtube.com/watch?v=JjdCkSsLYLk

#AMD #NVIDIA #Arc

I'm comparing the #Framework motherboards with the AMD Ryzen 7840U against the Intel Ultra 7 155H, which is $120 more expensive.

The two processors appear to perform similarly on these benchmarks, and single-core performance is about the same as that of my current #System76 Lemur Pro with an ultra low-power Intel CPU (i7-1355U).

I couldn't find results to compare the integrated GPUs.

cpubenchmark.net/compare/5322v
#benchmarks
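For a rough local sanity check of single-core performance, a hand-rolled loop like the following can produce a comparable number on two machines. This is my own sketch, not PassMark's methodology; the workload and the score's scale are arbitrary.

```python
# Rough single-core check: count iterations of a fixed integer
# workload completed on one thread within a time budget.
import time

def single_core_score(duration=1.0):
    # Higher is faster; only comparable across runs of this same script.
    deadline = time.perf_counter() + duration
    iterations = 0
    while time.perf_counter() < deadline:
        x = 0
        for i in range(1000):
            x += i * i
        iterations += 1
    return iterations

if __name__ == "__main__":
    print(f"single-core score: {single_core_score():,}")
```

Numbers from a toy loop like this won't match PassMark's single-thread ratings, but run on both machines they give a quick relative comparison.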

www.cpubenchmark.net · AMD Ryzen 5 5560U vs Intel N100 vs i5-2410M vs i5-2450P vs i5-2550K, by PassMark Software