Maybe ChatGPT has some pre-frontal cortex problems

(solresol.substack.com)

19 points | by solresol 184 days ago

7 comments

MisterKent 184 days ago
This is a really odd way to test the capabilities of an LLM. First, most photos of clocks show 10:10, since the watches in the training data are usually set to 10:10 (in order to better sell watches, etc.).
Second, I don't think the photo generation aspect of ChatGPT is being marketed or presented as a problem-solving AI.
chomp 184 days ago
I like the part where the AI couldn’t be trusted to draw a clock, so we trusted it to psychoanalyze the incorrect clock
solresol 184 days ago
I administered the CDT (Clock Drawing Test) to ChatGPT and got Claude to diagnose what was wrong with the "patient" based on the results.
There are signs of pre-frontal cortex damage or early stage dementia.
- pointlessone 184 days ago
  But does the patient get better or worse with each update?
pnm45678 184 days ago
Here's the thing (which you probably knew going in): generative AI is quite well known to be terrible at drawing specific times on clock faces.
This is down to the training data. It has been trained on a huge number of images.
That includes advertising. For whatever reason, wrist watch manufacturers have a tendency to set watches to 10:10 in ads, almost without exception. Perhaps it's just a nice-looking time, or it's good for comparison purposes.
Simply Google "wrist watch" and you'll see.
So, these generative models have a huge bias towards 10:10 on clock faces, because that's what all the clocks they've been trained on look like.
airstrike 184 days ago
FWIW, Claude 3.5 Sonnet got the SVG right on the first try: https://claude.site/artifacts/8dedf16e-b861-4497-96e2-872773...
Prompt was just "create an svg of a clockface with the time being 10 past 11"
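For reference, the arithmetic behind "10 past 11" is small enough to sketch. This is only a rough Python illustration, not the SVG Claude actually produced; the names `hand_angles` and `clock_svg`, the canvas size, and the hand lengths are all assumptions made up for the example.

```python
import math


def hand_angles(hour, minute):
    """Return (hour_angle, minute_angle) in degrees, clockwise from 12 o'clock."""
    minute_angle = minute * 6.0                      # 360 degrees / 60 minutes
    hour_angle = (hour % 12) * 30.0 + minute * 0.5   # 30 degrees per hour, plus drift
    return hour_angle, minute_angle


def clock_svg(hour, minute, size=200):
    """Build a minimal SVG clock face (hypothetical helper, illustration only)."""
    cx = cy = size / 2
    radius = size * 0.45
    hour_angle, minute_angle = hand_angles(hour, minute)

    def tip(angle_deg, length):
        # SVG's y axis points down, so 12 o'clock is at cy - length.
        a = math.radians(angle_deg)
        return cx + length * math.sin(a), cy - length * math.cos(a)

    hx, hy = tip(hour_angle, radius * 0.5)    # short hour hand
    mx, my = tip(minute_angle, radius * 0.8)  # longer minute hand
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{size}" height="{size}">'
        f'<circle cx="{cx}" cy="{cy}" r="{radius}" fill="white" stroke="black"/>'
        f'<line x1="{cx}" y1="{cy}" x2="{hx:.1f}" y2="{hy:.1f}" stroke="black" stroke-width="4"/>'
        f'<line x1="{cx}" y1="{cy}" x2="{mx:.1f}" y2="{my:.1f}" stroke="black" stroke-width="2"/>'
        f'</svg>'
    )
```

With these formulas, `hand_angles(11, 10)` puts the hour hand at 335 degrees and the minute hand at 60, while the 10:10 pose has the hour hand at 305 degrees. The minute hand is identical in both, so the requested time differs from the advertising pose only by a 30 degree shift of the hour hand.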
pockybum522 184 days ago
I love the concept of the article where one LLM can't draw a simple clock but the other one can accurately diagnose medical conditions from a hypothetical drawn image.
batch12 184 days ago
It has sentience problems...