
OP here. Thanks for the feedback! Yeah, those are some pretty tough sentences (obviously not in the dataset). The tongue twister one is really interesting.



All the text in the post reads like an easy exercise in linguistics. I would not be surprised if there are hundreds of these in the training texts.

That's great. We have really struggled with diarization. None of the systems out there actually work! We get close but still mess it up from time to time.

Here are the results on the other files.

  4333.mp3: Oh yeah, it's still pretty tight, though. It's very challenging. They actually pull everything out of your.

  7510.mp3: Demons on TV like that. And for people to expose themselves to being rejected on TV or humiliated by fear factor or.

  8036.mp3: Well, I feel like as far as... as far as cursing and language, because I feel like as long as it's not necessarily in context, but.

  8522.mp3: Stuff to you, so you don't have to spend any body. He.

The last set of samples includes comparisons with the source human. The only one that I can guess with confidence is the last pair. (Sample 1 of the pair contains assumed-context emotional inflection.)

Apropos of [1]

1. Item 3: The ocean is full of floating objects, so it would be hard to see the duck among them? 2. Item 2: it is structured as a non sequitur; it takes a long time because there are many hazards?

I am impressed that you find it impressive. It is plausible-sounding, and I find that disturbing, but it is not useful (and the text-prediction paradigm seems a dead end in terms of formulating anything more than plausible-sounding output).


"Also true story I generated this data and actually wrote a for-loop that would print out the top ten most highly-correlated words from 10 to 1 and pause in between each word to build suspense (for me alone at my computer)."

Adorable.

I am curious why she would use word length and exclamation/emoji/question-mark ratios but not check spelling or punctuation. Surely those are more indicative of someone's level of education and reading habits.
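For what it's worth, the kinds of surface features being compared here are trivial to compute. A rough sketch (the feature choices mirror this comment, not whatever the original analysis actually used):

```python
def style_features(text: str) -> dict:
    """Crude stylometric features: average word length and punctuation ratios."""
    words = text.split()
    chars = len(text) or 1  # avoid division by zero on empty input
    return {
        "avg_word_len": sum(len(w) for w in words) / max(len(words), 1),
        "exclaim_ratio": text.count("!") / chars,
        "question_ratio": text.count("?") / chars,
    }

print(style_features("Wow!! Really?? That is so cool!"))
```

A spelling check, by contrast, needs a reference dictionary, which may be why it was skipped.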


This is actually really impressive. I tried a few naughty-ish words of course. 'Diarrhea' in particular gave me a chuckle for its accuracy.

I think they were also randomly selected from the list of all possible English sentences.

Eh, I think as "human evaluated" metrics go, it's a decent test of how well it can parse a reasonably complex sentence and reply accurately.

For me:

GPT4 3/3: I couldn't resist the temptation to take a bite of the juicy, red apple. Her favorite fruit was not a pear, nor an orange, but an apple. When asked what type of tree to plant in our garden, we unanimously agreed on an apple.

GPT3.5 2/3: "After a long day of hiking, I sat under the shade of an apple tree, relishing the sweet crunch of a freshly picked apple." "As autumn approached, the air filled with the irresistible aroma of warm apple pie baking in the oven, teasing my taste buds." "The teacher asked the students to name a fruit that starts with the letter 'A,' and the eager student proudly exclaimed, 'Apple!'"

Bard 0/3: Sure, here are three sentences ending in the word "apple": I ate an apple for breakfast. The apple tree is in bloom. The apple pie was delicious. Is there anything else I can help you with?

Bard definitely seems to fumble the hardest; it's pretty funny how it brackets the response, too. "Here's three sentences ending with the word apple!" Nope.

Edit: Interestingly enough, Bard seems to outperform GPT3.5 and at least match 4 on my pet test prompt, asking it "What’s that Dante quote that goes something like “before me there were no something, and only something something." 3.5 struggled to find it, 4 finds it relatively quickly, and Bard initially told me that quote isn't in the poem, but when I reiterated that I couldn't remember the whole thing, it found it immediately and sourced the right translation. It answered as if it were reading out of a specific translation, too: "The source I used was..." Is there agent behavior under the hood of Bard, or is that just how the model is trained to communicate?


> What’s the text version of a birthday cake?

"Happy birthday!"

> More challenging: what’s the text version of a singing, eye-rolling T-Rex?

Ok, you got me there, because I have no idea what concept is even meant to be communicated by such an absurd thing.


Thanks for following up!

I guess what happened is that a user of your system tried to be funny. In this utterance there are three short, fully grammatical sentences, and a fourth which is not very grammatical but fully understandable, commanding the device to clean the carpet :)

- Cleaning is good.

- Dust is so bad.

- Make your miracle.

- Cleanly my carpet. [1]

[1] original is an ungrammatical imperative

Some (more) human review would have been needed.

The Finnish samples are full of very weird utterances, too. Some read like something people might have written over 50 years ago, but nobody would speak like this today.

Maybe they were reviewed on Mechanical Turk with modest requirements/payment? Well, you get what you pay for...


A couple more I noticed:

"its a terrible habit" -> it's

"thats been hand-picked by a human" -> that's


That was brilliant. It took a while before I could see any that were 'improvised' words. "They took are (steve)jobs" with 'are' covering 'steve'.

I am actually surprised by how well it all worked. Other than the obvious cases of people just seeing how annoying they could be, in general you could get a sentence going with input from others.

Gonna bookmark this so I can come back to play later.


Wow, some of these are really unique and would be EASILY missed by simpler string matching. I had to read some out loud to understand them.

Faves:

> YOU PUT THE OUGHT IN WATERCOLORS.

> YOU PUT THE REITAN IN FRIGHTENED.

> YOU PUT THE EFFICIENTLY IN INEFFICIENTLY.

> YOU PUT THE BORED IN DEBORD.

> YOU PUT THE FARAH IN PHEROMONES. (!!?)


> Does anyone know what the codenames are like? If they are easy enough to remember, then they may be easy enough to brute-force?

I don't know what they're like, but if you take a list of 5000 common words and use 4 random entries for each codename, there are 625,000,000,000,000 possible combinations. Brute-forcing the entire space at 100,000 tries per second would take ~200 years.

Edit: I made a toy jsfiddle version: http://jsfiddle.net/SwWZ9/10/

The wordlist is just a random sampling of English nouns (I couldn't find a quick source of common nouns long enough). It may contain profanity, watch out!
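The arithmetic in that estimate checks out; a short script reproduces it (the 5,000-word list and 100,000 guesses/second rate are the commenter's assumptions, not measured figures):

```python
# Brute-force estimate for 4-word codenames drawn from a 5,000-word list.
WORDS = 5_000           # assumed dictionary size
WORDS_PER_CODE = 4      # words per codename
GUESSES_PER_SEC = 100_000

combinations = WORDS ** WORDS_PER_CODE
seconds = combinations / GUESSES_PER_SEC
years = seconds / (365.25 * 24 * 3600)

print(f"{combinations:,} combinations")   # 625,000,000,000,000
print(f"~{years:.0f} years to exhaust")   # ~198 years
```

Note this is the time to exhaust the whole space; on average an attacker finds a given codename after searching half of it.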


I actually picked sentences which were wrong.

"My thesaurus is terrible. It's also terrible."

— Siri's response when I asked her "Tell me a joke" last night.


It seems most of these are the result of a poor application of a thesaurus with no regard to context, but here are some tortured phrase gems from these "gobbledygook sandwiches" [0]:

"artificial intelligence" => "counterfeit consciousness" / "man-made brainpower" / "fake knowledge"

"mean square error" => "mean square blunder"

"sensitive data" => "touchy information"

"signal to noise" => "flag to clamor"

"breast cancer" => "bosom peril"

"big data" => "huge information"

"ant colony" => "underground creepy crawly region"

"Navier-Stokes" => "Navier-Stocks"

"NP-hard" => "NP-difficult"

"end-users" => "stop-customers"

"phishing attack" => "phishing assault"

"emission of CO2" => "excretion of CO2"

"deep learning" => "profound education"

"decision tree" => "choice bush"

"system failure" => "framework disappointment"

"real time" => "genuine time"

"fuzzy logic" => "feathery rationale"

"child nodes" => "tyke hubs"

"state-of-the-art" => "United States of America-of-the-cleverness"

"directional (graph) axes" => "directional tomahawks"

"magic mushrooms" => "wizardry mushrooms"

"max pooling" => "Georgia home boy pooling" (!?)

"malicious parties" => "compromising get-togethers"

[0] https://dbrech.irit.fr/pls/apex/f?p=9999:5
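The mechanism described above, blind synonym substitution with no regard to context, is easy to reproduce. A minimal sketch (the tiny synonym table is invented for illustration; real tortured phrases come from automated paraphrasing tools run over whole papers):

```python
# Context-free synonym substitution: the recipe behind "tortured phrases".
SYNONYMS = {
    "artificial": "counterfeit",
    "intelligence": "consciousness",
    "big": "huge",
    "data": "information",
}

def torture(text: str) -> str:
    """Replace each word with a 'synonym', ignoring context entirely."""
    return " ".join(SYNONYMS.get(w, w) for w in text.lower().split())

print(torture("artificial intelligence on big data"))
# counterfeit consciousness on huge information
```

Because the substitution never sees the surrounding words, fixed technical terms like "big data" get mangled just as readily as ordinary prose.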


> So you think the chance of human beings to come up with 4 random words is pretty low?

I've always wondered how effective the random-words thing is. Sure, there are something like 100k English words in current use according to Google, but it seems like a list of the most common few hundred of those words would crack a lot of passwords.
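The difference can be made concrete with an entropy calculation (the dictionary sizes below just restate the comment's numbers, a few hundred common words versus the full ~100k vocabulary):

```python
import math

def passphrase_bits(dict_size: int, n_words: int = 4) -> float:
    """Entropy in bits of n words drawn uniformly from a dictionary."""
    return n_words * math.log2(dict_size)

print(f"{passphrase_bits(300):.1f} bits")      # ~32.9 bits: feasible to crack
print(f"{passphrase_bits(100_000):.1f} bits")  # ~66.4 bits: far harder
```

So the scheme's strength depends almost entirely on whether the words are actually drawn uniformly from a large list, rather than being common words people pick themselves.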


A+ wordplay in the title of the actual survey.
