1. NWOPD

    NWOPD Administrator

    Joined:
    Oct 16, 2020
    Messages:
    272
    Likes Received:
    330

    linguistic analysis

    Discussion in 'Research' started by NWOPD, May 1, 2022.

    I was thinking about the degree to which we're tracked and traceable in the Information Age, and it got me thinking about linguistic analysis.

    The technology certainly exists to identify you with a high degree of accuracy, as long as whoever possesses it has a sample of your writing, and a sample exists for pretty much everybody.

    It got me thinking, how would an at-risk journalist, whistleblower, dissident, or modern Satoshi Nakamoto avoid linguistic analysis?

    I’m still researching the subject, but I think it relies on the frequency of certain words, diversity of vocabulary, the national origin of the words used, and perhaps the inclusion of figures of speech.
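
    As a rough illustration of the kind of measurements listed above, here is a small Python sketch. It is only a guess at the general idea, not how any actual forensic tool works: the short function-word list and the names `style_profile` and `function_word_distance` are made up for this example.

```python
import re
from collections import Counter

# Tiny illustrative list of common "function words"; this list is an
# assumption for the example, not a standard one.
FUNCTION_WORDS = ("the", "of", "and", "to", "a", "in", "that", "but", "which", "i")


def style_profile(text):
    """Return a few crude style measurements for one writing sample."""
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    if total == 0:
        raise ValueError("text contains no words")
    counts = Counter(words)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    profile = {
        # Diversity of vocabulary: distinct words divided by total words.
        "type_token_ratio": len(counts) / total,
        # Average number of words per sentence.
        "avg_sentence_length": total / max(len(sentences), 1),
        # Punctuation habit: commas per word.
        "commas_per_word": text.count(",") / total,
    }
    # Relative frequency of each function word.
    for word in FUNCTION_WORDS:
        profile["freq_" + word] = counts[word] / total
    return profile


def function_word_distance(text_a, text_b):
    """Sum of absolute differences in function-word frequencies (0 = identical)."""
    pa, pb = style_profile(text_a), style_profile(text_b)
    return sum(abs(pa["freq_" + w] - pb["freq_" + w]) for w in FUNCTION_WORDS)
```

    The thought is that a smaller distance between an anonymous text and a known sample would hint at the same writer, though short samples would make numbers like these very noisy.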
     
    MartinM and Cdn Writer like this.
  2. Bruce Johnson

    Bruce Johnson Contributor Contributor Contest Winner 2023

    Joined:
    Jan 9, 2021
    Messages:
    1,340
    Likes Received:
    959
    Maybe try using some AI to 'translate' it. But to me, when you use AI in this manner, like writing an article or answering a question, it's not really a creative AI (at least based on the examples I've seen); it's the summation of everyone's knowledge. But they will only get better.
     
    Cdn Writer likes this.
  3. NWOPD

    NWOPD Administrator

    Joined:
    Oct 16, 2020
    Messages:
    272
    Likes Received:
    330
    But if you used an AI to translate/mask your original words, they could theoretically reverse the process, sort of like applying a reverse algorithm to undo the AI translation. I guess you could use multiple AI translators to abstract the original words, i.e. modify with an AI, translate to Russian, then back to English, but then you risk losing the clarity of message from the original writing.

    If you wanted to preserve the essence of the original message, I think the dissident would have to be very conscious about his selection of words not just to convey his message, but also knowing linguistic analysis would be used against him.

    I guess I need to figure out how actual linguistic analysis works, and how word selection and punctuation act as digital fingerprints.
     
    Cdn Writer likes this.
  4. SapereAude

    SapereAude Contributor Contributor

    Joined:
    Jan 21, 2021
    Messages:
    1,714
    Likes Received:
    1,359
    There's more to linguistics than word selection and punctuation. In the strictest sense, linguistics is the study of verbal communication (speech), not written language.

    Simple way to defeat analysis: write in your native language. Translate into a second language using Google translator. Then use Bing translator to change it back to the original language. Try it with a few paragraphs of your own writing -- I think you'll be amused by the results.
     
    Cdn Writer likes this.
  5. Xoic

    Xoic Prognosticator of Arcana Ridiculosum Contributor Blogerator

    Joined:
    Dec 24, 2019
    Messages:
    12,460
    Likes Received:
    13,503
    Location:
    Way, way out there
    Funniest example of this I've seen:

    "The flesh is willing," translated into (something Asian, I believe) and back, comes out as "The meat wishes it so." You end up with something as incomprehensible as those instruction pamphlets included with Asian electronics or Swedish furniture.
     
    Cdn Writer likes this.
  6. Xoic

    Xoic Prognosticator of Arcana Ridiculosum Contributor Blogerator

    Joined:
    Dec 24, 2019
    Messages:
    12,460
    Likes Received:
    13,503
    Location:
    Way, way out there
    A different approach: use some kind of code so only people with the key can decipher it. I guess that wouldn't work though if you're talking about books or leaflets or something you want the public to see and understand.

    Maybe something like the digital equivalent of cutting letters out of newspapers? Cut and paste groups of words from all kinds of documents, finagling and editing where necessary to make it work. You should end up with something very different from your own writing style, except in those areas where you had to do extensive re-writing. They'd need to develop an algorithm that can figure out where you did that (maybe by locating all your source material and finding where you deviated from it significantly) and then glean out the parts written mostly by you.
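
    A rough Python sketch of what that kind of detection might look like, assuming the analyst has already found one source passage. It uses the standard library's `difflib.SequenceMatcher` to line up the words of the source against the pasted-together text and returns the runs that were changed or added. The function name `authored_spans` is made up for this example, and a real system would have to search many sources, which this does not attempt.

```python
from difflib import SequenceMatcher


def authored_spans(source, final):
    """Return word runs in `final` that were changed or added relative to `source`."""
    src_words = source.split()
    fin_words = final.split()
    # Compare word lists rather than characters; autojunk is disabled so
    # common words are never silently ignored.
    matcher = SequenceMatcher(None, src_words, fin_words, autojunk=False)
    spans = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        # "replace" and "insert" mark words present in `final` that are
        # not aligned with identical words in `source`.
        if tag in ("replace", "insert"):
            spans.append(" ".join(fin_words[j1:j2]))
    return spans


source = "the quick brown fox jumps over the lazy dog"
final = "the quick brown fox leaps gracefully over the lazy dog"
print(authored_spans(source, final))
```

    The spans it returns are the parts where the writer's own hand shows through, which is exactly the material a style analysis would then focus on.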

    Or maybe dictate to someone but have them write it up in their own words. Or do the above and have somebody else edit it for meaning, so your hand isn't in the mix anywhere.
     
    Last edited: May 1, 2022
    Cdn Writer likes this.
  7. MartinM

    MartinM Banned

    Joined:
    Sep 1, 2020
    Messages:
    225
    Likes Received:
    205
    Location:
    Hong Kong
    @NWOPD

    OK, I live in Hong Kong, which is starting to see some changes in how the population can express itself. This statement, as I understand it, is wrong or misses the point....

    “It got me thinking, how would an at-risk journalist, whistleblower, dissident, or modern Satoshi Nakamoto avoid linguistic analysis?”

    A journalist doesn’t want to avoid linguistic analysis, but may want to hide the location of his posting. The major issue is trust in any source material. Although I am a born and bred Yorkshireman, I get pissed off at the skewed reporting of Hong Kong affairs by The Guardian in the British press. However, articles of real reporting of incidents can cause problems with local Government but do show a real picture of life. That journalist needs his words told as is and not skewed, and his location kept hidden.

    What you are talking about is AI config of a structured story telling with a skew. This already occurs and can be found as easily as looking through the msn news drop on your browser. An at-risk journo or whistleblower needs to show the story is real. Rinsing it through a translator actually detracts from and devalues the story’s worth. I can tell reading a Guardian article that the journalist doesn’t live in Hong Kong and hasn’t read the feel of the average local that lives here. However, that story will influence readers.

    The information age has corrupted people's trust in genuine, unbiased reported news. A unique writing style acts like a signature on a piece of work and reinforces its authenticity. Washing the story through a translator and grammar checker makes it lose its impact.

    Take my view with a pinch of salt…


    MartinM
     
    petra4 and Xoic like this.
  8. Joe_Hall

    Joe_Hall I drink Scotch and I write things

    Joined:
    Apr 20, 2021
    Messages:
    469
    Likes Received:
    497
    I think too it can be modified by the amount of time you spend in one location. I have lived all over the continental US and internationally and have picked up quite a vocabulary and grammar syntax that makes folks back home in Michigan scratch their heads.
     
  9. Xoic

    Xoic Prognosticator of Arcana Ridiculosum Contributor Blogerator

    Joined:
    Dec 24, 2019
    Messages:
    12,460
    Likes Received:
    13,503
    Location:
    Way, way out there
    Fixed
     
    MartinM likes this.
  10. Robert Musil

    Robert Musil Comparativist Contributor

    Joined:
    Sep 23, 2015
    Messages:
    1,219
    Likes Received:
    1,387
    Location:
    USA
    I think I agree with @MartinM , if I understand their point correctly. In a setting where your writing might call down the law on your head, you can have security, or you can get your writing out to the public, but not both. And that's true even without any fancy linguistic analysis. Where do you think street names of drugs came from? That's why there are so many, and why they are constantly changing. The volume of street names is a form of jamming, and constantly changing them is what you would do with any cipher. Both of these make it harder for the authorities to figure out what you're talking about and surveil you.

    The tradeoff of course is that then you can't market your drugs to a very broad swath of the public, because they won't know what you're even talking about. But I guess in that calculation it's better than getting arrested...

    People have been using creative writing to try and evade surveillance for probably as long as there's been writing. It's a constant arms race, without end. Every time a new surveillance technology comes along, people figure out a way to defeat it, only for another technology to come along.
     
    MartinM likes this.
  11. ABeaujolais

    ABeaujolais Member

    Joined:
    Nov 6, 2021
    Messages:
    56
    Likes Received:
    66
    Short rant. Artificial intelligence these days is neither artificial nor intelligence. It's real stupidity. Large companies are increasingly using so-called AI to provide so-called customer service, with more companies every day running their customer services departments with no real person available to help customers. What used to be a call center with 1,000 customer service representatives is now three tech nerds and a company cat. Why would they care?

    Many years ago I read a fascinating book about regional dialects in the U.S. The author claimed that someone with knowledge and experience in regional dialect can place a person within 50 miles of where they were born with a high degree of accuracy. Having lived in a few different places in the country, I believe the premise. There are certain words and phrases that are uniquely connected with a geographical region.

    In my career I worked with technical publications, not creative writing. A characteristic of a well-written technical publication is to make everything in the publication look like it was written by the same person. The writing process and style was formalized and practiced to avoid personalized style.

    I believe it would be nearly impossible for an individual writer to mask their individualized dialect since the clues given are not given consciously by the writer. If I was trying to mask my writing to avoid being tied to any trait, I would find at least one other person to work with. The desired result would be dry, technical, and concise, written in a consistent stylebook manner.
     
  12. Alcove Audio

    Alcove Audio Contributor Contributor

    Joined:
    Oct 26, 2021
    Messages:
    684
    Likes Received:
    348
    Doesn't this apply to the spoken word? More of an accent thing?
     
  13. ABeaujolais

    ABeaujolais Member

    Joined:
    Nov 6, 2021
    Messages:
    56
    Likes Received:
    66
    Actually, I don't recall accent as being one of the clues. Mostly it was different words used in different areas of the country to describe the same thing. I'm certainly no expert, but I have found dialect quirks in different places I've lived. In central Illinois, folks use the word "Coke" generically to describe any kind of soft drink. A soft drink as "soda" also is used in Illinois. Minnesota has several unique words and phrases. The word "borrow" is used in the opposite context from the dictionary, as in, "Will you borrow me five dollars?" Also, what was a "casserole" to me most of my life is called "hotdish" in Minnesota. Traffic lights vs. stop-and-go lights, etc.
     
  14. SapereAude

    SapereAude Contributor Contributor

    Joined:
    Jan 21, 2021
    Messages:
    1,714
    Likes Received:
    1,359
    When I took Linguistics in college it was 10 miles, not 50.

    And I believe it. There's a town that's not contiguous to my home town but probably not more than 5 miles away as the crow flies. I can almost always tell a person who grew up there after talking with them for 5 or 10 minutes.

    I suppose the radius may have expanded since I took that class. I think in general there's more innate homogeneity to society today. People move more often, there are regional high schools that draw from two or three (maybe more?) towns, magnet schools that draw from a number of surrounding communities ... all those factors contribute to watering down unique local accents and dialects.
     
  15. SapereAude

    SapereAude Contributor Contributor

    Joined:
    Jan 21, 2021
    Messages:
    1,714
    Likes Received:
    1,359
    Yes. In the class I took in college, this was mentioned along with the factoid that (back then -- the mid-1960s) many television networks sent their news anchors to "talk school" to learn how to speak basically accent-less English. The model for this was the way English is (was) spoken in one of the mid-western states -- might have been Ohio. The theory was to have the talking heads on national television speaking in bland, accent-less English so that viewers from one region wouldn't be distracted from the news by the newscaster's accent.
     
  16. KiraAnn

    KiraAnn Senior Member

    Joined:
    May 6, 2019
    Messages:
    482
    Likes Received:
    336
    Location:
    Texas
    I believe it was Nebraska or Kansas. IIRC, Johnny Carson was considered the ideal "TV American English".

    Back on point, a decently good reporter could certainly mask their origin/location using written words. They would need to be careful, but it would not be difficult, especially if they had taken acting lessons.
     
  17. Xoic

    Xoic Prognosticator of Arcana Ridiculosum Contributor Blogerator

    Joined:
    Dec 24, 2019
    Messages:
    12,460
    Likes Received:
    13,503
    Location:
    Way, way out there
    Reminds me of the scene from The Howling where the anchorman is practicing his speech in front of the mirror in the bathroom in perfect accentless English until somebody else walks in and suddenly he's all like "Oh, Hi Bill!" in a deep southern accent. I tried to find the scene but it doesn't seem to be online or I didn't search hard enough.
     
  18. Xoic

    Xoic Prognosticator of Arcana Ridiculosum Contributor Blogerator

    Joined:
    Dec 24, 2019
    Messages:
    12,460
    Likes Received:
    13,503
    Location:
    Way, way out there
    I'm also reminded of the Mid-Atlantic accent/dialect that was so prominent among Americans who wanted to sound posh or high-class. It was de rigueur for public speakers and radio announcers, even early TV commentators. It was basically a sort of fake English accent added to American English. You can hear it in all those old political speeches like "The only thing we have to feah is... feah itself!" or "Ask not what yoah country can do for you, but what you can do foah yoah country!"
     
  19. Bruce Johnson

    Bruce Johnson Contributor Contributor Contest Winner 2023

    Joined:
    Jan 9, 2021
    Messages:
    1,340
    Likes Received:
    959

    There's a specific accent in very upper middle class Massachusetts that's similar to this but it may be dying now. I forgot the name of it, but I found a video on YouTube through r/obscure media of two guys having a conversation in it.

    It's similar to the post-rowing scene in 'The Social Network' where the two older guys were like "Have you ever seen a closer race?"
     
    Xoic likes this.
  20. Bruce Johnson

    Bruce Johnson Contributor Contributor Contest Winner 2023

    Joined:
    Jan 9, 2021
    Messages:
    1,340
    Likes Received:
    959
    Found the video. The accent is called Boston Brahmin.
     
    Xoic likes this.
  21. Xoic

    Xoic Prognosticator of Arcana Ridiculosum Contributor Blogerator

    Joined:
    Dec 24, 2019
    Messages:
    12,460
    Likes Received:
    13,503
    Location:
    Way, way out there
    Wow, it's like a visitation from a couple of Victorian gentlemen! Even the way they're dressed is seemingly from an earlier world. Of course, that was recorded in the 80s. If they're in their 80s that means they were born in the Victorian era. And apparently holdouts of it. My grandpa was about that age, born in 1901, but he didn't talk like that. He was country folk, not upper-echelon gentleman type. In his day boys all boxed and hunted and he hated that our generation didn't do that.
     
    Bruce Johnson likes this.
  22. JLT

    JLT Contributor Contributor

    Joined:
    Mar 6, 2016
    Messages:
    1,857
    Likes Received:
    2,235
    I don't think there was anything fake about those speeches. Roosevelt's "fear" speech was in his natural accent as an upper-class "Hyde Park" New Yorker, and Kennedy's "ask not" speech was in his own upper-class "Brahmin" Bostonian accent. Neither of them was attempting to imitate English accents. Compare their speeches to their recorded off-the-cuff comments and you'll find that the accents are identical.
     
  23. Xoic

    Xoic Prognosticator of Arcana Ridiculosum Contributor Blogerator

    Joined:
    Dec 24, 2019
    Messages:
    12,460
    Likes Received:
    13,503
    Location:
    Way, way out there
    Did you think I was saying they speak that way only when they're making public proclamations?
     
  24. Xoic

    Xoic Prognosticator of Arcana Ridiculosum Contributor Blogerator

    Joined:
    Dec 24, 2019
    Messages:
    12,460
    Likes Received:
    13,503
    Location:
    Way, way out there
    Here's what I meant by fake—from Wikipedia:

    The Mid-Atlantic accent, or Transatlantic accent, is a consciously learned accent of English, fashionably used by the early 20th-century American upper class and entertainment industry, which blended together features regarded as the most prestigious from both American and British English (specifically Received Pronunciation). It is not a native or regional accent; rather, according to voice and drama professor Dudley Knight, "its earliest advocates bragged that its chief quality was that no Americans actually spoke it unless educated to do so".
     
  25. JLT

    JLT Contributor Contributor

    Joined:
    Mar 6, 2016
    Messages:
    1,857
    Likes Received:
    2,235
    I inferred that from your reference to its use by "public speakers and radio announcers." Forgive me if I misinterpreted you.
     
    Xoic likes this.
