ioc.exchange is one of the many independent Mastodon servers you can use to participate in the fediverse.
INDICATORS OF COMPROMISE (IOC) InfoSec Community within the Fediverse. Newbies, experts, gurus - Everyone is Welcome! Instance is supposed to be fast and secure.


I like how we took something computers were masters at doing, and somehow fucked it up.

@oli I get the right answer when I try. Same inputs.

@jesusmargar @oli and this is one of the problems with LLMs—they’re inherently stochastic
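[The "inherently stochastic" point comes down to how tokens are sampled. A minimal sketch of temperature-scaled softmax sampling (a toy, not any particular model's implementation): the same inputs can yield different tokens on each call.]

```python
import math
import random

def sample_token(logits, temperature=1.0, rng=random):
    """Sample one token index from logits via temperature-scaled softmax.

    temperature controls randomness: near 0 it approaches argmax
    (effectively deterministic); higher values flatten the distribution.
    """
    scaled = [l / temperature for l in logits]
    m = max(scaled)                      # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

logits = [2.0, 1.0, 0.5]
# Identical inputs, repeated calls: the set of sampled tokens typically
# contains more than one distinct index.
samples = {sample_token(logits) for _ in range(100)}
```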

Kevin Riggle

@jesusmargar @oli the inability to create reproducible test cases for these systems is an enormous problem for our ability to integrate them into other systems

@kevinriggle @oli surely they depend on a seed which depends on time?

@jesusmargar @oli somewhere. And sometimes it’s possible to start it from a known and fixed constant, and get the same results for the same prompt every time. (You can do this with some of the image generation models and invokeai iirc.) But in larger systems and longer interactions even with a fixed PRNG seed the path taken through the PRNG space matters, and small perturbations in it can create large changes in outcome
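[The point about the "path taken through the PRNG space" can be made concrete with a toy generator (a tiny LCG chosen here so the draw sequence is easy to trace; real samplers use far larger state, but the effect is the same): a fixed seed reproduces a single prompt exactly, yet an unrelated earlier request shifts which draws the same prompt receives.]

```python
class TinyLCG:
    """Minimal linear congruential PRNG so the draw sequence is easy to trace."""
    def __init__(self, seed):
        self.state = seed
    def draw(self):
        self.state = (5 * self.state + 3) % 16
        return self.state

def generate(prompt, rng):
    """Toy stand-in for a sampler: one PRNG draw per word of the prompt."""
    return [rng.draw() for _ in prompt.split()]

# Same seed, same prompt asked first: identical output every time.
assert generate("hello world", TinyLCG(1)) == generate("hello world", TinyLCG(1))

# Same seed, but an unrelated request consumed the first draws, shifting the
# position in PRNG space -- the same prompt now samples different values.
rng = TinyLCG(1)
generate("unrelated question", rng)            # consumes draws 1-2: [8, 11]
assert generate("hello world", rng) == [10, 5] # draws 3-4, not [8, 11]
```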

@jesusmargar @oli (ask unrelated questions A and B in that order, get good answers A’ and B’. Ask them in the order B and A, get the complete text of Atlas Shrugged)

@jesusmargar @oli there’s some feedback loop missing, and these systems diverge rather than converge

@kevinriggle @jesusmargar @oli which is great if you want something to create unique elevator music or wallpaper, and terrible for virtually everything else

@kevinriggle

That was fun to read. I literally lol'ed.

@kevinriggle @jesusmargar @oli we’ve been told we should create ‘plausibility tests’ that use a (different?) llm to determine whether the test result is fit for purpose. also, fuck that.
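[For what a "plausibility test" like that might look like: a hypothetical sketch where `call_llm` is a placeholder for whatever judge model you'd plug in, not a real API. Note the judge is itself stochastic, so this gates plausibility, not correctness.]

```python
def plausibility_test(output: str, requirement: str, call_llm) -> bool:
    """Ask a (second) model whether `output` satisfies `requirement`.

    `call_llm` is an injected placeholder: any callable that takes a prompt
    string and returns the judge's reply as a string.
    """
    verdict = call_llm(
        f"Requirement: {requirement}\n"
        f"Output: {output}\n"
        "Answer YES if the output satisfies the requirement, otherwise NO."
    )
    return verdict.strip().upper().startswith("YES")

# Stubbed judge, just to show the plumbing:
assert plausibility_test("4", "the answer to 2+2", lambda p: "YES") is True
```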

@airshipper @kevinriggle @oli perhaps the problem is expecting deterministic behaviour rather than accepting some degree of inexactness. I mean, I wouldn't use it to make final decisions on cancer treatment, for instance, but maybe it's ok to polish a text that isn't too important.

@jesusmargar @kevinriggle @oli i would use it to generate waste heat from the exchange of tokens, after shifting a sizable chunk of our engineering budget from salaries to services sigh

@jesusmargar @airshipper @kevinriggle @oli The problem is that, using your cancer example, they tried to pair it with a doctor looking at its determinations to help them decide if a skin lump was cancer or not - and it turned out that the doctor's rarely would correct the LLM, even when the LLM is intentionally wrong.

@AT1ST @airshipper @kevinriggle @oli mmh, in which country? In the UK follow this method and doctors correct all the time when wrong. What the doctor does is using the output to decide on diagnostic test to use.

@jesusmargar

"maybe it's ok to polish a text that isn't too important" - My feeling is that if the text isn't too important, it doesn't need much polishing, and a human should do any polishing necessary anyway. Then later when the human has to polish text that is absolutely critical to get right, the human has had practice at polishing and does it well.

@airshipper @kevinriggle @oli

@lady_alys @airshipper @kevinriggle @oli I guess you don't need to write in a foreign language for work. For those of us without that privilege LLMs can help us level up!

@jesusmargar
Correct, I don't need to write in a foreign language. Is translation to another language what you meant by "to polish a text"?

@lady_alys no. I am not a native English speaker. I write in English at work. Sometimes (very occasionally) I ask it to correct the grammar in some paragraphs or to reduce the length of a given text. It does a pretty good job at that.

@jesusmargar this is the use of generative ai that i have the most sympathy for, because ‘knowledge work’ in a second language is hard.

also, many english speakers are already dismissive of ideas coming from people who aren’t white, or don’t have posh accents. being able to write well is a good counter to that.

@airshipper absolutely, and often it's all prejudice. When I lived in the US, people on the street were friendly with me and assumed I was French because of my accent (which is Spanish). However, I was once criticised in student evaluations for my 'South American accent'. My accent isn't close to South American at all. The difference is that the student knew my name, which is Spanish, and assumed that made me South American. Suddenly, that piece of information made my accent less desirable.

@airshipper if I had had darker skin I'm sure I'd have encountered more such comments.

There are problems with GenAI but there are also very legitimate uses. I'd like it if it could offer alt text here, for instance. It often gets it right, and it would make alt-less images unlikely. I sometimes think people on this site are a bit too luddite for my taste, refusing any use of the technology because of the bad consequences of some very specific uses.

@kevinriggle @jesusmargar @oli I'm surprised how inconsistent it is actually. I just tried five times on chatgpt.com (not logged in).
* Wrong
* Wrong but corrected itself at the end
* Wrong but corrected itself at the end
* Correct
* Wrong but corrected itself at the end

Always with slightly different wording. AI is weird.
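[Tallying repeated trials like the five above is easy to script. A sketch with a stubbed stochastic "model" (the outcome labels and weights are assumptions loosely matching the trials, not real measurements): the distribution is only reproducible if you control the seed, which you can't do through a chat interface.]

```python
import random
from collections import Counter

def ask_model(prompt, rng):
    """Stand-in for a chat model: returns one of three outcome labels,
    weighted to loosely match the five trials above (an assumption)."""
    return rng.choices(
        ["wrong", "wrong, self-corrected", "correct"],
        weights=[1, 3, 1],
    )[0]

rng = random.Random(0)
tally = Counter(ask_model("same prompt", rng) for _ in range(1000))

# Reproducibility here is only possible because we control the seed:
rng2 = random.Random(0)
assert tally == Counter(ask_model("same prompt", rng2) for _ in range(1000))
```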