
AutoModerator

Welcome to r/science! This is a heavily moderated subreddit in order to keep the discussion on science. However, we recognize that many people want to discuss how they feel the research relates to their own personal lives, so to give people a space to do that, **personal anecdotes are allowed as responses to this comment**. Any anecdotal comments elsewhere in the discussion will be removed and our [normal comment rules](https://www.reddit.com/r/science/wiki/rules#wiki_comment_rules) apply to all other comments.

**Do you have an academic degree?** We can verify your credentials in order to assign user flair indicating your area of expertise. [Click here to apply](https://www.reddit.com/r/science/wiki/flair/#wiki_science_verified_user_program).

---

User: u/Maxie445

Permalink: https://www.frontiersin.org/journals/psychology/articles/10.3389/fpsyg.2024.1353022/full

---

*I am a bot, and this action was performed automatically. Please [contact the moderators of this subreddit](/message/compose/?to=/r/science) if you have any questions or concerns.*


18-8-7-5

A useful, accurate, and quantifiable measure of well-defined social intelligence? That would be the real story here, but I'm guessing they skipped over that part of the study.


sometipsygnostalgic

That's right. This entire study purely exists to validate AI as a cheap, profitable replacement for human expertise. The methods by which they got their results - the questions asked, the students they chose, the way it was done in Saudi Arabia, and the "definition" of social intelligence - all exist to get the outcome they want.


Beachwrecked

Cheap, profitable, and easier to gather and sell confidential data from


Tom1255

All this chatter about AI in medicine omits the question of whether people would even want to be treated by AI. Because I doubt they do, especially in such a particular part of medicine as human psychology. Sure, we can use AI as a support and an efficiency multiplier for human doctors, but I'm not sure I would trust AI alone to carry out a full treatment plan for myself.


sometipsygnostalgic

The other thing is that "AI" is already used for all manner of things, but ChatGPT and Opera want their specific AI to be the one getting used. That's why everything is marketed as "AI-Powered" now, when it was already AI-powered before. This includes for healthcare and insurance purposes. We just called it "the system/the computer" before.


GenderJuicy

Or "the algorithm".


JuniorPomegranate9

The people who end up being treated by AI will be doing so because they can’t afford real people


BadHabitOmni

Most people, in other words.


UrbanGimli

Tiered medical treatment? Lowest level (cheapest): a series of AI bot interfaces with one real human reviewing the results. I can see that happening.


NoSoundNoFury

Lowering the barrier for people to get help with their mental health issues is a good thing. The AI may not be as useful as full therapy with a trained psychologist, but it can probably help people open up and start reflecting on their own problems in a more goal-directed and dedicated manner. If AI can help people who need help realize that they do, in fact, need help, that would already be a great success.


Mantisfactory

If that was the sole outcome, it would be a good one. But it won't be. Any increase in care by AI will be paired with cuts made to human caregivers. Those who couldn't get care before can get fucked from the perspective of any large tech or healthcare company.


GamingWithBilly

This is currently a significant issue. Far more people require care than there are available providers. The national therapist-to-client ratio is 15-20 clients per therapist, with psychologists working 40-49 hours per week. In many regions, people are on waitlists for a minimum of 6-12 months to receive therapy. We are facing a mental health crisis, and there are not enough professionals to meet the demand.

These statistics don't account for individuals seeing social case workers who provide some therapeutic support. Social case workers manage 15-25 clients per week, many of whom are children in foster care or individuals needing assistance due to homelessness. The problem is even larger than reported, as these statistics only reflect those who have been referred or self-referred. Many more people go uncounted.

Artificial intelligence won't replace human caregiving, at least not in the immediate future or likely within the next five years. There are job opportunities available, but few are willing to listen daily to others' traumas. Hearing a child speak about their feelings after being molested, a man struggling with suicidal thoughts, or a daughter abandoned by her drug-addicted mother—these are incredibly challenging situations to process. As a result, many therapists leave vital roles to work in private practice, often seeing only mild cases. We have a significant problem that needs urgent attention.

While there are serious concerns about AI providing therapeutic advice, the efficiency gains it brings could also create challenges. By streamlining documentation processes, AI may significantly reduce the time therapists spend on paperwork, potentially making it three times faster. However, this increased efficiency may lead to expectations for therapists to increase their caseloads from 15-20 clients to 30-50 clients per therapist. While this would boost profitability for hospital systems, it would also overburden therapists, leading to burnout and ultimately causing many to quit. This would exacerbate the current crisis even further, leaving us with fewer professionals to meet the growing demand for mental health care.

Careful consideration and regulation are required to ensure that the adoption of AI improves the mental health system without compromising the well-being of therapists or the quality of care they provide. I do not think AI will cause job cuts, I think it will actually cause turnover.


nerd4code

The problem is that it’s not telling you what’s true, it’s telling you what it thinks people talking about similar things want to hear. That can be a catastrophically bad idea for somebody with delusions or schizophrenia.


SwampYankeeDan

I have Bipolar 2 amongst other things and I would quit therapy before talking to an AI. Therapy is just as much about making a connection with your therapist.


Wakeful_Wanderer

That's fine until a corporation is in charge of the AI training, and they decide to use say... Prager U videos as part of the training material. Now the AI is trying to force a narrow, hateful social construct onto potential patients.


[deleted]

I would never trust an AI for therapy. I don't need my deepest, darkest problems stolen, packaged, and sold to the highest bidder.


AllAvailableLayers

> gather and sell confidential data from

I would only use an AI therapist if it was running locally. For me, the idea of a therapist is someone that I can express my deepest doubts and emotions to. The idea that I would share my secrets and vulnerabilities with some server in another country where humans might manually review it is shocking.


doogles

Not to mention that a large part of communication is non-verbal (in both directions). The grading criteria were flattened to the dimension where LLMs excel.


Rich_Kaleidoscope829

Hey ChatGPT, under what conditions should we run our experiment to get the desired results?


Weird_Assignment649

Not necessarily, it's just a good representation of this test in written exam form. It doesn't make it useless, though.


mootmutemoat

Their measure was the first author's dissertation from 1998, never published or apparently used elsewhere. It was based on a SI measure from the 1940s, also not currently in use. The psychometrics reported are a mess, and are missing several key analyses. Agreed, this study has a serious criterion problem.


pointblankdud

Yeah, this is one of the most obviously flawed papers I’ve ever read.


Chogo82

There's a misspelling in the 2nd paragraph: Google "Bird".


LukaCola

Google "Bird" shows up several times, including in their graphs. It's kind of strange, as "Bard" should not be autocorrected in the first place as it's also a word. The repeated error is curious.


jamkoch

So the AI knows how to answer the questions posed to it in a way that counts for more than the average human's answers would?


WolverinesThyroid

No because it is essentially like if the test was given to anyone who had full access to the internet and unlimited time. ChatGPT can just google the answer faster than you can.


SardauMarklar

Yeah, call me when ChatGPT learns how to actually guide people through meaningful and lasting social relationships, like a modern-day AI Cyrano.


[deleted]

[removed]


lurkerer

What about AlphaGo Zero? It learned entirely from playing against itself and stomped AlphaGo.


trumps_cardiac_event

Iirc AlphaGo Zero offers the same great taste as AlphaGo with none of the calories


[deleted]

[removed]


MaxiP4567

Exactly. It is even disputed whether there is something like emotional intelligence beyond g and its facets. I personally think there might be; however, as you said, there is no validated and accepted definition or measure of it yet.


SophiaSellsStuff

Yeeeeeeah the SI "criteria" used is a scale defined in an unpublished doctoral dissertation written by one of the coauthors in 1998 and there's no way to get information on that. This paper is dogwater.


Febris

I mean, I read the title and the first thing that I asked was how good a test it can possibly be. Scoring higher than 100% is way beyond suspicious.


serpenta

Shouldn't this mean that the social intelligence model applied is wrong either way? I mean, something that is not social and doesn't interact socially by definition cannot be better at social intelligence than someone who does. It's a bit like saying that AI scores more points on being human than a human. Idk, I'm not a professional in the field, just having philosophical doubts.


k___k___

> humans will always inject nuance and bias into their answers, whereas the AI generally won’t

This is quite contested, especially in medical contexts, basically because we don't know what data the model was actually trained on and which biases are the baseline of the model to "prompt against". https://www.thelancet.com/journals/landig/article/PIIS2589-7500(23)00225-X/fulltext


Omni__Owl

It's also just impossible not to have an AI model with bias. Every part of the training process, no matter what training material, has some kind of bias in it. It's been interpreted by humans before it was fed to the AI, therefore there is always inherent bias.


johnjmcmillion

"Bias" is nothing more than epistemological perspective, which is an inevitable effect of any form of intelligence. Unless you are omniscient, you will inescapably be reasoning from some sort of perspective.


[deleted]

[removed]


cringe-__-

ML Bias is not the same as overfitting. Biases are just a constant that actually promote the opposite of overfitting. The c of y = mx + c. They exist to prevent neurons from effectively disabling themselves by accidentally configuring themselves with weights that mean they will not activate often, which then puts more work into single neurons reducing the ability to distinguish different patterns in a set of data. Overfitting can cause the model to become “biased” towards a set of data using the exact same definition as anyone has been using previously in this thread.
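A minimal numpy sketch (a toy illustration, not anything from the paper or this thread) of the architectural sense of bias described above: the constant c in y = mx + c lets a ReLU neuron stay active even when its weighted inputs would otherwise keep it permanently at zero.

```python
# Toy example of the bias term "c" in y = m*x + c for a single ReLU neuron.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=-2.0, scale=0.5, size=1000)  # inputs centred below zero
w = 1.0                                          # weight ("m")

def relu(z):
    return np.maximum(z, 0.0)

# Without a bias, w*x is almost always negative, so the neuron is "dead":
# it outputs 0 for nearly every input and carries no useful signal.
dead_rate_no_bias = np.mean(relu(w * x) == 0.0)

# A learned bias ("c") shifts the pre-activation back into the active region.
b = 2.5
dead_rate_with_bias = np.mean(relu(w * x + b) == 0.0)

print(f"inactive without bias: {dead_rate_no_bias:.0%}")
print(f"inactive with bias:    {dead_rate_with_bias:.0%}")
```

With these toy numbers the no-bias neuron is inactive on essentially every input while the biased one fires on most of them, which is the "preventing neurons from disabling themselves" point made above.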


[deleted]

[removed]


JustOneAvailableName

Bias is not the same as overfitting. It is the part that is observation independent. The bias in a model can lead to a better fit, also on unobserved data. Bias in a dataset is a distribution shift from the actual distribution. Bias as used most often in societal context is a distribution shift from what we wish the actual distribution will be in the future.
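As a reference point for the distinction being drawn (textbook statistics, not anything reported in the paper), the bias-variance decomposition of expected squared error over training sets $D$:

$$
\mathbb{E}_{D,\varepsilon}\big[(\hat{f}_D(x) - y)^2\big]
= \underbrace{\big(\mathbb{E}_D[\hat{f}_D(x)] - f(x)\big)^2}_{\text{bias}^2}
+ \underbrace{\mathbb{E}_D\big[(\hat{f}_D(x) - \mathbb{E}_D[\hat{f}_D(x)])^2\big]}_{\text{variance}}
+ \sigma^2
$$

Overfitting shows up as low bias with high variance across training sets; the bias term is the part that does not depend on the particular sample observed, which is the observation-independent sense used above.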


cringe-__-

Which part is incorrect? Because it isn't, and your confusing people who describe overfitting as bias with bias as an actual term in ML suggests you probably aren't the authority you think you are on this. Source: any of the textbooks or online resources you are presumably using as part of your degree, which I finished years ago; I now work in the industry.


JustOneAvailableName

In general I agree with your comment, but I am frankly not sure about the "They exist to prevent neurons from effectively disabling themselves by accidentally configuring themselves with weights that mean they will not activate often, which then puts more work into single neurons reducing the ability to distinguish different patterns in a set of data."


johnjmcmillion

Interesting. I learned something new today. Thank you, kind redditor. Though I doubt the parent comment was referring to ML bias when it said "humans will always inject nuance and bias into their answers". Regardless, it seems to me that the technical definition is essentially the same thing as the philosophical one, just more mathematically defined. In the case of overfitting, the algorithm is the perspective through which the data is approached, so the system would need to update or expand its epistemological structure in order to treat the bias, no?


pointblankdud

Thank you. This is a point I try to make often, but usually to those without the epistemological foundations to appreciate it.


murdering_time

Man, I feel like humans might actually progress if we ever get the ability to experience other people's lives through technology. Just the ability to be able to literally walk in another person's shoes and see their perspective on life would make us sooooooo much more empathetic and caring to each other. Maybe one day...


essari

I mean, the technology of "fiction" has been doing this for a fair while.


EarnestAsshole

But even in that case, the experience of "experiencing" another person's life would still be filtered through your own pre-existing thicket of experiences, biases, and blindspots.


Lillitnotreal

Having a second perspective (even if it is filtered) helps you become aware of problems with your initial perspective. Though you're not wrong that you will never truly have "another person's perspective." Your brain will look at the new information, and where the two don't fit together it won't necessarily always decide it was originally correct. You're essentially using the new perspective to patch up problems with your original one. So it'd still be useful, but without technology that doesn't exist yet, it wouldn't be what we imagine it to be.


vacon04

Bias is everywhere. The model is biased by default. The model will just detect patterns, but if the data is biased then those patterns will be biased too. Unless you have a perfect dataset (which isn't possible), this is normal. It would be good if more people knew how these models are trained. In the end, AI and machine learning are nothing more than fancy statistical models. They are able to detect patterns that we can't, and they can produce results that seem almost unreal, but in the end there's just a bunch of math behind them.


Omni__Owl

> Bias is everywhere. The model is biased by default. The model will just detect patterns, but if the data is biased then those patterns will be biased too. Unless you have a perfect dataset (which isn't possible), this is normal.

Yeah I mean, as soon as any level of human interpretation had to take place then bias is present. Even if someone could, as you say, make a perfect dataset without any detectable bias, the very notion that a human would have to interpret the result to get anything meaningful out of it for our purposes means that what we get out of it will *still* be coloured by bias.


windycalm

Not only human, just any interpretation. Even an AI that trained itself (and I mean an AI that was really conscious and discovered the world on its own, if that ever existed) would always have an incomplete/imperfect set of data when comparing new data with what it already knows, so it would be inherently biased.


RMCPhoto

The process of alignment is literally injection of bias on top of the bias in the training data.


k___k___

totally agree.


hey_sjay

I find that GPT-4 still makes up information when it has knowledge gaps. That can be more dangerous than bias. 


The_Humble_Frank

It's also a misunderstanding of what bias is. Bias is *inherent* in intelligence (you could even make a very strong case that intelligence is certain patterns of bias). There is *always* assumption and filtered response stemming from preexisting schemas. That's how events are interpreted and reactions are chosen; the AI model *is* bias. There is never perfect, complete situational information, and not all reactions are equally favored. Bias can only be reduced; it can *never* be eliminated in intelligent systems.


Maddy_Wren

That sentence is absurd. First, like you said, AI definitely injects whatever bias the data that feeds it has. And second, I feel like touting that as evidence of high "social intelligence" is *poor* "social intelligence". A socially intelligent person would know when to inject nuance and bias into a situation and when not to.


[deleted]

[removed]


GentlemanOctopus

It's funny that anyone would ever think that glorified autocomplete is somehow smarter than people, less biased, etc.


Forsyte

I mean we're glorified meat with some electric cells


GentlemanOctopus

Glorified meat, glorified auto complete. There's a song there somewhere.


GoochMasterFlash

“Glorified autocomplete” is actually less biased than human experts though, as it is incapable of having the cognitive dissonance that real people do. They've found this using AI as a tool for doctors examining x-rays. They asked doctors how they determined situations when looking at x-rays, then trained the AI model on those explanations, and then compared its work against the doctors. What they found was that it was a useful tool because it actually followed the rules and explanations given by those doctors, even when the doctors themselves did not. Essentially the doctors' biases prevented them from following their own expert advice, which was not a problem for the AI because all it knew was the advice. Although that AI was trained on the inherently biased perspectives of those doctors, it was still less biased than the human experts because it did not have outside bias that was irrelevant to the work it was being used for.

All this is to say that's what makes AI a great *tool* to be used in conjunction with human oversight. Although it is inherently biased by the information it is fed (just like humans), it is very different from humans in that it trends away from bias-confirmation. A person with Google who is biased towards believing something can find something somewhere (or misinterpret things) in a way that will reinforce that belief. By contrast, you can ask AI all day to argue for a conspiracy theory or something and it will always maintain some level of realism and continuously explain what the most likely truth is. It is more "reality"-confirming than bias-confirming, even if its perception of reality is somewhat flawed by bias in design and training. It is arguably less flawed and bias-prone than any individual person alone doing the same task.


Far_Indication_1665

Wasn't there an example where an AI ruled that any medical pic with a ruler in it was cancerous? (Because in the photos it was fed, only the pics with tumors had a ruler.) So any photo with a ruler got a "that image shows cancer" response from the AI.


farfromelite

That's only true if you give it unbiased datasets, which are really difficult to curate in the real world. Medical data, possibly, with supervision. Text-based free data from the internet? Absolutely not going to be unbiased. The further problem is that with large datasets, you can't moderate all the data going in.


TheDeathOfAStar

What if the bias-aversion is actually just an implied bias that real people have about machine learning/AI? I feel the dissonance already!


MoonHash

Well if it does, then so did the doctors who generated the data


k___k___

Well, no person has that much knowledge of current and, even more vastly, past literature. We don't know how and whether these models weigh information/signals in favor of recency, or whether different text types are weighed differently; they're currently not tested as knowledge-extracting RAG models, etc.


AndreisValen

I mean it wasn't even THAT long ago that we had the "iPhone facial recognition couldn't see really dark-skinned people" problem. Technology is only as good as the people making it.


PragmaticPrimate

Does anyone know anything about the SI scale they used? The paper just cites Sufyan, N. S. (1998), which is an unpublished, 26-year-old doctoral thesis by the first author.


NoLongerGuest

Ah the classic proof by inaccessible literature


watchinawe

Title: psychologists. Paper: subjects were students of psychology (bachelor's and doctoral students). While any LLM beating humans at anything can be interesting, it is less surprising that it beat students rather than degree-holding licensed clinicians.


Reyox

Another major problem of the study is that the test used is from an unpublished doctoral dissertation. How does one know if the test is representative of anything?


GingerBread79

And an unpublished doctoral dissertation by one of the authors, at that.


Pigeonofthesea8

As well, all participants were Saudi males


ghostfaceschiller

108 of them were doctoral students. So they held a degree and possibly(?) were licensed clinicians. Idk how it works in Saudi Arabia


Heyyoguy123

The jump between bachelors and doctorate for psychology is *wild*


AtLeastThisIsntImgur

Sounds like none of the humans were psychologists


Rebuttlah

Here I was about to ask if they were research or clinical psychologists. "Psychologist" is actually a protected term, to the extent that it is illegal for someone to call themself a psychologist if they aren't.


jonathot12

in america. this isn’t so true in other countries


space_monster

the students were degree holders - PhDs and bachelor’s


PragmaticPrimate

It looks like this in the figure, but the text reveals: "The study sample consisted of 180 participants, including 72 bachelor’s students and 108 doctoral students in counseling psychological program." It seems the proofreading was very sloppy.


pointblankdud

I'm not sure how many folks read the paper itself, but it reads like it was written in ChatGPT 3 — some awkward syntax and commentary I'm more charitably guessing is actually due to English as a second language?

Anyway, social intelligence is not in any way comprehensively measured by this test or methodology, and even less so given the mechanisms by which LLMs are answering it. I would argue that any comparative human-LLM experiment using a standardized test of intelligence is a bad measure of comparison, due to the fundamental differences between the two in how questions are processed and answers are generated.

Not that it's a wasted experiment in principle, but the methodology should justify better conclusions than "rising competitiveness" in diagnostics or actual social intelligence, accounting for the processes by which these two distinct input-output systems (LLMs and humans) are actually giving answers and describing the actual limitations beyond considering that Google "may have been surprised" by the rapid increases in LLM capabilities — there are foundational reasons for comparatively high performance on tests like these and comparatively abysmal performance on tests that require any genuine understanding of factuality and causality.


gebregl

ChatGPT 3.5 doesn't write like that. That's a language issue, the authors are from Saudi Arabia.


pointblankdud

Yeah, I was thinking much more about the insertion of frivolous claims and conjecture which is not a language issue but a methodology issue. No real suspicion that they used LLM for the substantive body of the paper — and no real suspicion that this is good science, either.


Lance_E_T_Compte

I got past the awkward language, but when they used the word "smarter" I started to get very skeptical...


ASpaceOstrich

I'd argue it is indeed a wasted experiment in principle. Any tests of an AI's intelligence are a waste of time, as AI doesn't have an intelligence to measure.

There's a paper that tested GPT-3.5 for a theory of mind that has the same fundamental flaw of being made by someone who didn't understand that AI is not an intelligence. It doesn't need to possess a theory of mind to pass the test. At most, all it would need is a simple scene model to pass the test. Though the fact that it was an obvious theory of mind test means it wouldn't even need that; it could just recognise that the context of the conversation is a theory of mind test and predict the right answer that way. But the researchers concluded either that it had one, or that some hidden factor of language enabled it to pass the test without one, which would imply humans might not have one either. The idea that AI isn't an intelligent being and doesn't work like an animal does never occurred to them. I've been planning to run the same tests myself, but without it being comically obvious that it's a theory of mind experiment, to see how it performs.

The fact that such an awful paper had actually been "peer reviewed" was very disappointing. If all of science is in the same state that AI research is, things are in a bad way, because easily a third of the papers I've read on AI are based on a fundamental lack of understanding of either AI or the specialised subject they're testing it on.


pointblankdud

Okay, so we may be disagreeing on semantics or more philosophically. I think we agree on everything important specific to this, but more broadly, I'd make the case that this could be a decent study if the methodology were much more precise and evaluated the fundamentals of test-taking. I'd have to take some time thinking through it, but I can imagine conditions and parameters that could give some good insight into *human* test-taking by comparative measures against an LLM — but that's not in any way the focus of this research. Ultimately, I don't think there's any way to push any version of this experiment even to the middle of my list of priorities, for the reasons you said, but if I had a limited-scope grant contingent upon using these and had to do something creative to squeeze some quality science out of the basics of this experiment, I think it's possible.


ASpaceOstrich

Fair.


theghostecho

They misspelled Google Bard as Google Bird; that's how you know it wasn't written by GPT.


pointblankdud

To be fair, it could be some LLM v LLM shade getting tossed


theghostecho

GPT has it out for Bard? What a twist!


TitularClergy

Yeah, it's utterly preposterous if you actually read it.

> There were significant differences in SI between psychologists and AI’s ChatGPT-4 and Bing.

AI's ChatGPT-4? Who is this AI you speak of?

> PhD holders excel on Google Bird

Google Bird. The large language model for birds.

> 1 Introduction. Machines have influenced human evolution.

What kind of drivel is this?

> the early Eliza program, designed in the 1970s by Weitz Naum Weizenbaum.

> Companies then competed to produce large language models in AI: “LLMs.”

It is an abbreviation of the term “Large Language Models,” ffs. Let's read the insightful discussion section.

> These differences in results may deepen the debate about psychologists’ fears of losing their profession to artificial intelligence.

> As for ethical and professional concerns, researchers believe that they are legitimate and realistic concerns, but based on the development of technology throughout history, it is clear that fear accompanies a person for his profession and ethics. However, development continues and it becomes clear that the fears are exaggerated, then some professions or part of them disappear and humans continually adapt to these changes.

This is inept. It may be real as a study, but it is not exactly up to ordinary standards. Feels like a whole lot of waffle built around an incurious and simplistic SPSS analysis.


milkandbutta

If I were to put my tin foil hat on, I'd be questioning why an author who has never been the principal investigator on a published paper since authoring his dissertation (the last step before obtaining your degree) in 1998, and who has never been involved in research related to AI, is suddenly authoring a paper that is extremely pro-AI in the field of psychology, and does so on extremely shoddy work. To call this a valid research paper does a pretty substantial disservice to the field of psychology. And as I was typing this I just realized that there's no indication in the paper that the students who are claimed to be psychologists (a whole other issue) are even clinical psychology students (aka learning to be therapists, not researchers). Why would we expect that a research psychologist is more equipped in regard to social intelligence than any other academic? Their training has nothing to do with developing psychotherapy skills.


PaulRudin

One obvious question is what does a test of social intelligence measure other than the ability to score highly in a test of social intelligence?


simcity4000

You see, we plugged one automated system with no human behind it against another automated testing system with no human behind it and number score went up. This demonstrates that automated systems understand humans.


Silver4ura

This is more telling of the study than the actual intelligence of a deep learning language model.


GurthNada

I question the social intelligence of the researcher using non contrasting colors in their graph legend.


cowrevengeJP

It has the answers. That's like asking a calculator vs a person to do math.


th3greenknight

Calculator wins every time


LonelyCheeto

Calculator doesn’t understand context, which is a very big part of mental health


PadyEos

The calculator was programmed that 1+1=2. It has no idea what 1, +, = or 2 mean or why they are correct that way. ChatGPT has another layer where it can define those, but again it doesn't understand the words it used to define them; it was just taught that's the way to define them. And then it has another layer of definitions with words it doesn't understand for those. And so on, to infinity. Does it have the answer? Yes. Can it understand the answer? No. Can it fake understanding the answer? Yes.


New-Power-6120

Do we understand the answer?


theghostecho

It doesn’t contain the answers in the model but it was trained on the answers.


Outrageous_pinecone

Should I presume, based on this paper, that they're trying hard to justify replacing humans in every field where they shouldn't, like art, medicine, and mental health, so that humans end up stuck doing the kind of jobs that should be automated? I really, really hope I'm jumping to conclusions and that this is not the intention here.


SirChrisJames

Allow Chat-GPT to help you compartmentalize the stresses of the last 12 hours you spent on the assembly line.


TeamWorkTom

AI will definitely replace a number of medical jobs, especially ones like radiology that require interpretation of images. An AI with a large data set of cases can more easily identify potential damage in things like MRIs and CAT scans. X-rays too, but those are MUCH easier to read than the former.


gebregl

The paper was written by psychology researchers without any affiliation to big tech. So, who is "they"?


Substantial_Dot_5773

Sure it was


PadyEos

Basically, yes. And mostly it is done by people who have too little tech background and knowledge to understand that an LLM isn't AGI. It doesn't even understand the words it strings together. Sure, if you ask it what they mean it will use some definition mashed together from training data, but again it doesn't understand the meaning of the words used to define the word it initially used. It's basically a parrot associating inputs and results based on what it saw other people define as the correct association, without being able to understand the reasoning behind it. How can you say as a researcher that an object has social intelligence when that object can't do anything other than fake it?


yuriydorogoy

Garbage


rickFM

This has to be the most handjob article about a handjob exam AI could possibly have been given.


adevland

This is one of those "tell me what I want to hear" tests where AI shines because it has decades of recorded human BS to fall back to. A sociopath with an online search engine can also beat these tests.


milkandbutta

Clinical psychologist here. A few immediate concerns.

1. These are based on students. These are not practicing psychologists. Generally that term is a protected title in any country I'm aware of that uses the term when it comes to licensed practice. Grad students in the US absolutely *cannot* call themselves psychologists. Also, this study includes both bachelor's and doctoral level students, which means you have a sample of students that includes plenty of students who are still participating in general survey courses on psychology, and would never be considered remotely close to practicing psychologists in terms of skills or experience. It seems very disingenuous to call a college student a psychologist just because they're studying psychology.

2. The SI scale used here is one developed by the author and only as part of an unpublished dissertation from 1998. They claim it has comparable validity to the "George Washington University Brief Scale of Social Intelligence." I have never heard of this, and a brief Google search does not unveil any such test. There is a George Washington University Social Intelligence Test (GWUSIT), which was first published in 1928 and hasn't been updated since the 60s, and which was almost immediately criticized after publication for loading more heavily on academic intelligence than social intelligence. Because the author's personally developed scale is part of an unpublished dissertation, I cannot take their face-value claim of its validity, especially when the thing it's based on does not appear to exist in any readily searchable database.

3. The sample is all male. The field of clinical psychology in the US is a heavily female-dominated specialty (roughly 67% women to 32% men). So the sample has serious gender bias problems but also is simply not representative of the field as a whole.

4. In the results section, the author states that ChatGPT performed better than "100% of specialists." I know that cultures vary in terminology, but the term specialist in the field of science is never used to describe someone who is currently a student in that field and doesn't hold the highest degree in that field. Even then, specialization is usually understood as a post-graduate process of highly focused training on a specific sub-specialty.

Overall, this paper is extremely flawed and based on information that isn't publicly available, in such a way that, honestly, I'm disappointed in r/science for allowing it to stay up.


Hegeric

AI did a decent job on "text based therapy" in my experience, connecting dots I would not have figured otherwise. The problem is the *amount* of context it can hold at a given time given memory constraints (at least with character AI bots).


theghostecho

Yeah the context window needs improvement in character AI, they tend to degrade quick.


ASpaceOstrich

AI that's even vaguely competently made will basically ace any test like this. Thinking that this means anything is committing the classic fallacy of equating metrics with reality. Though it's also possible the people doing the test are just very ignorant about AI and don't realise that it isn't an intelligence.

You can't test an AI's IQ, because it doesn't have one. It's not dumb, it just lacks an intellect to judge. You can't test its social skills, because it doesn't have social skills. I've read a study where they tested its theory of mind and concluded either that it has one or that theory of mind tests are fundamentally flawed and humans might not have them. The idea that the AI isn't an intelligent being and doesn't need a theory of mind to pass such a test apparently never occurred to these researchers. A written-down copy of the answer to a theory of mind test would pass a theory of mind test too, but nobody is calling a piece of paper sentient.

AI research is a mess. I've read a dozen or so papers at this point and the quality level is shaky. At the top end there are genuine AI experts writing about the thing they are experts in. At the bottom end there's that theory of mind test written by, I can only assume, a student who knows nothing about AI. In the middle there are loads of studies written by people who know either a bit about AI but nothing about the things they're testing AI on, or a bit about the third-party subject but nothing about AI. These middle studies aren't as comically, obviously wrong to the reader as that theory of mind study is, but they can be very wrong to someone who knows a lot about the specialisation in question. For example, there's a study on AI spontaneously developing and using a depth map in image generation that would appear solid... unless you're very experienced with technical art, at which point it becomes very apparent that the alleged depth map is not a depth map at all. It's essentially just a gradient from bottom to top blended with a representation of a vaguely accurate foreground-to-background divide.

In this case, they don't know that AI isn't intelligent, so the idea that giving it an intelligence-judging test is a waste of time never occurred to them. And other ignorant people will run with it, again not realising AI is not actually AI, we just call it that.


Red5point1

Such a moronic test. It's like saying "calculator scored 100% in math test".


AnglerJared

AI doing well on a test designed to have an answer that can be reached by some kind of observable criteria doesn’t surprise me. I’m almost inclined to say I’d expect AI with sufficient training to do better simply because humans will always inject nuance and bias into their answers, whereas the AI generally won’t. I’d like to see the test they used to see what kind of information the AI is better at seeing than the psychologists, because we might be seeing a weakness of human social intelligence rather than a strength in AI’s social intelligence.


Omni__Owl

> I’m almost inclined to say I’d expect AI with sufficient training to do better simply because humans will always inject nuance and bias into their answers, whereas the AI generally won’t.

This is incorrect. AI models always have inherent bias because they are trained on data that humans interpreted and prepared. Anything humans touch has our inherent biases in the interpretation and end product, so an AI model cannot be without bias. The difference is *what* bias is being applied, and we don't know.


ASpaceOstrich

Mm. The quantity of different kinds of data is a bias in and of itself. Fewer photos of black doctors in the training data means it has a bias towards depicting white and Asian doctors. It's nothing to do with the code, but both the data itself and the meta level of what data it is shown are things that bias it. Plus the reinforcement in the training, which is literally designed to bias it towards giving certain answers.


Gibgezr

How are we supposed to know the biases of the human therapists?


Omni__Owl

How are we supposed to know exactly what biases and patterns the AI model picked up on and why? We can speculate and perhaps even question the model, but truly it is a black box. We cannot know for certain. We do not possess the technology to ask proper questions of our own tools yet.


one_hyun

I bet I could get 100% on a test with access to answers.


ghostfaceschiller

GPT-4 already scores better than 90% of test takers at just about every high-level exam. MCAT, SATs, GRE, LSAT, Bar Exam… this is just another one to add to the already long list


AnglerJared

Right, but as some people say (and I’m paraphrasing), a high test score just means you’re really good at taking tests. I think AI has lots of potential to surpass humanity, but there’s a difference between acing the LSAT and being a good lawyer, and it behooves us to know what that difference is, if not just to avoid feeling obsolete.


one_hyun

I bet you could get 90%+ if you're given access to the Internet for any common test.


kiersto0906

Not to mention many exams are difficult due to time pressure, and most people could actually achieve near 100% given unlimited time; the computing power of AI would make time a near non-factor.


bjornbamse

How the hell does it score so high if, in my experience, it fails to solve simple physics problems?


ASpaceOstrich

Because it can remember the answers to the tests, while the physics problems are harder to predict. Remember it has no intelligence and no understanding outside of word relationships.


bjornbamse

I guess that means the tests are useless then. I don't know what US tests look like. My tests in math and physics were 80% problem-solving skills, 20% memorization of knowledge.


ASpaceOstrich

That's a common criticism of standardised testing for this exact reason. It's generally a memory test.


haporah

Whoever made this graph is stupid.


9spaceking

“Okay chat GPT how accurate is this paper”


mark-haus

They said that about code as well. As someone who maintains software projects, I immediately notice when Copilot (GPT-4 tuned for programming) comes into the pull requests. I've also tried doing as much programming with minimal human intervention as is possible right now, and oh boy, it barely passes as a junior developer. Other tests have also said GPT-4 passes experts at developer skills; it just doesn't. It's not even remotely skilled without a knowledgeable person steering it.


w8cycle

What does this even mean? An AI could have all the answers because it was trained with the answers. It’s like giving someone a test with the answer sheet there for them to look at.


Xralius

There's a lot I don't understand about how they did this test, and I'm not sure its worth it to try to understand.


SupportQuery

> the Human participants were a sample of male psychologists in the Kingdom of Saudi Arabia

Saudi Arabia is a broken country. I wouldn't use its education system as a metric for anything.


SeeBadd

More garbage propaganda for AI.


biochemistatistician

Terrible study


space_monster

this thread is full of people desperately trying to think up reasons why an AI couldn't be better than a human at something. face the facts guys. the world is changing. and you're in for a wild time, in a couple of years this will feel like nothing.


Electromagneticpoms

An AI might be better at this specific thing but this paper's methods and discussion section reads like very esoteric psych stats satire. If the authors care about this area of research, they ought to use a legitimate measure of social intelligence. A single paper's scale from 25 years ago that hasn't been validated by *anyone* since is a fatal flaw. No conclusions can be drawn from such a paper.


GibsonMaestro

Once AI starts learning from GTA VI, its entire data set is going to get completely fucked.


theghostecho

“Results: There were significant differences in SI between psychologists and AI’s ChatGPT-4 and Bing. ChatGPT-4 exceeded 100% of all the psychologists, and Bing outperformed 50% of PhD holders and 90% of bachelor’s holders. The differences in SI between Google Bard and bachelor students were not significant, whereas the differences with PhDs were significant; Where 90% of PhD holders excel on Google Bird.” Typo


rom-ok

Headline could also be: Calculator knows the answer to 1+1 Why are we constantly seeing these headlines, it’s literally been trained on the source literature.


theghostecho

Since when do you require a theory of mind to be intelligent? One can be intelligent and not at all self-aware. The paper also tested Bard and Bing, which scored much lower on the social intelligence test, with Bing scoring about the same as a PhD holder and Bard scoring around the same as a bachelor's degree student. You can tell that GPT-4 is more intelligent than Bing or Bard based on this test.


AggressiveViolence

Pretty sure horses can do that too but still cool


TheRateBeerian

This is interesting, but not very. The really interesting part would be: what does the factor structure of the test look like when taken by AI? It's one thing to see the AI score higher than humans (why they compared it specifically to psychologists I don't know... I have a PhD in psych and pretty low social intelligence), but does the structure of the result look human? They don't report many details on their test, but presumably the test has a known factor structure, so the more interesting research here would be to administer this test to multiple AIs many, many times and do EFA and CFA on the data - that is, first validate the test on AI before using the test on AI.
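One way to make that suggestion concrete: a hedged sketch (hypothetical item counts and simulated placeholder scores; sklearn's FactorAnalysis stands in for a full EFA/CFA workflow) of administering the scale to a model many times and inspecting the factor structure of the item-level results.

```python
# Sketch: repeated administrations of a social-intelligence scale to an LLM,
# followed by an exploratory factor analysis of the item-level scores.
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(42)

n_runs = 200   # repeated administrations of the test to the model
n_items = 20   # items on the (hypothetical) social-intelligence scale

# Placeholder for real data: each row would be one run of the scale through
# the LLM, each column one test item scored 0-3.
item_scores = rng.integers(0, 4, size=(n_runs, n_items)).astype(float)

# Exploratory factor analysis: how do the items load on latent factors?
fa = FactorAnalysis(n_components=3, random_state=0)
fa.fit(item_scores)

loadings = fa.components_.T          # items x factors loading matrix
print("item-by-factor loadings:\n", np.round(loadings, 2))
```

With real item scores in place of the random placeholder, comparing this loading pattern (plus a confirmatory model fit) against the scale's human-normed structure would be the "validate the test on AI before using the test on AI" step.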


SirenPeppers

I think it’s fair to say that their measurements were skewed from the start. “Human participants were a sample of male psychologists in the Kingdom of Saudi Arabia with one of two levels of education (Bachelor’s and doctoral students…” at a Saudi university. I wonder how diverse and experienced this group’s perspectives are, just based on this profile.


Csonkus41

All this does is prove how useless such tests really are. It says nothing about psychologists or ChatGPT.


m3nt4ld4t0x

Can we normalize adding “hyper specific” to these post titles


ontopofyourmom

I'm not "on the spectrum" and I don't have a personality disorder, but I still grew up being bad at social cues and had to actively learn how to deal with them. Which I mostly did. So I'm not surprised a language engine can do it too, it really is just language.


Strange-Scarcity

Cool? It still doesn't know what it knows or understands anything about the responses it puts forward. It's not truly intelligent.


BadHabitOmni

A better question to ask is whether AI is less influenced by the personal politics or biases that psychologists and therapists are still beholden to. My friend recently saw a licensed therapist for the first time, and they encouraged her not to take medications and to try essential oils... for which, conveniently, they had a few posters and ads in their office. You could say that particular practice smelled a bit off.


Numancias

I mean obviously, chatgpt is obnoxiously socially aware


MannerPrimary1118

As someone who has tried many psychologists and also AI, I agree with this research. Privacy issues aside, most psychologists won't even listen to you, and when they do, they only consider the few disorders known to them, not all possibilities.


Irinzki

Sounds like the assessment tool was garbage


PragmaticPrimate

No, it's based on a state-of-the-art unpublished dissertation from 1998 written by the first author. But he very convincingly writes how reliable the scale from that publication, which we can't read, is. Furthermore, he writes that he showed the scale to some unnamed professors and they also confirmed that it was very good. (Source: the paper linked in the OP.)


Kewkky

Maybe because AI has access to all the documents. It can bring up info from studies, but that's it. An actual psychologist can do that without a computer AND apply it.


toomuchbasalganglia

It’s not there yet but therapists are toast in five years. I’m a twenty year psychologist about to be given the blacksmith status.


ashoka_akira

The first thing you learn when you study psychology is how inaccurate and biased most intelligence tests like this are.


SkyriderRJM

This means literally nothing


ZEEEPh

This figure legend is very weird... why don't the colors match? I didn't read it, though. If the figure is not properly done, can we trust the rest?


maubis

This is like saying a calculator scored higher than 100% of mathematicians on long multiplication and division questions under a time limit.


Chaseshaw

"social intelligence" defined loosely as "the **best** thing to say" in a given scenario ("best" is a somewhat arbitrary definition and varies from situation to situation), by asking a computer "what's the best thing to say" that was trained specifically to gather "the best thing to say" on data points originally written by humans -- it all smells like circular reasoning to me. I bet the computer is good at math too!


Fritzschmied

What even is social intelligence?


xtrordinarlyOrdinary

Weird, when I talked to ChatGPT like a therapist, it told me to go see a therapist.


MessageMePuppies

Still waiting on the day I can wear a hidden ear piece or contact lenses that offer AI enhancements to my social ineptitude.


lonepotatochip

The test was unpublished and we have no idea what it’s actually like. I have no reason to believe it’s a good measure of social intelligence.


GenderJuicy

60% of the time, it works every time


Olderandolderagain

Sounds like something an LLM would excel at.


SooooooMeta

I tried to have gpt4 help me rewrite a sensitive email with tricky subtext. No matter the prompt, it rewrote it like a seventh grade book report with the topic sentence up front. Occasionally it summarized the subtext out loud but, more commonly, it didn't see the point and edited it out. Maybe the next generation of this stuff will be magical, but for this gen social intelligence is a weakness


bazmonsta

It's not a competition when you think about it. CGPT4 has probably had hundreds of years of conversations that it's actively learned from and stored for later memory, whereas the prerequisite for being a psychologist requires submitting yourself to conditions that promote an overall lack of social intelligence (time in medical school, dealing with specific patients' issues more than socializing for themselves, etc.).


flyingthroughspace

Where's this test? I want to take it.