  1. OurJud

    OurJud Contributor Contributor

    Joined:
    May 21, 2009
    Messages:
    9,502
    Likes Received:
    9,758
    Location:
    England

    Auto tags on posts

    Discussion in 'Support & Feedback' started by OurJud, Oct 3, 2020.

    It seems a little odd to me that more than three tags may be added automatically to opening thread posts, only for the post to be rejected on submission because no more than three are allowed.

    Is there no way to restrict how many are added automatically to avoid the error message?
     
    NWOPD likes this.
  2. Komposten

    Komposten Insanitary pile of rotten fruit Contributor

    Joined:
    Oct 18, 2012
    Messages:
    3,016
    Likes Received:
    2,193
    Location:
    Sweden
    Personally I'd prefer to remove auto tagging altogether. We currently have over 10000 unique tags in the system, most of which are stop words and typos. We can tell the forum that "this tag is a synonym for this other one" or "ban this tag" but doing that manually would take ages.

    Anyway, disabling this feature or limiting it to only put a certain number of tags is a thing only Daniel can fix.
     
    EFMingo, Iain Aschendale and OurJud like this.
  3. big soft moose

    big soft moose An Admoostrator Admin Staff Supporter Contributor Community Volunteer

    Joined:
    Aug 1, 2016
    Messages:
    22,636
    Likes Received:
    25,931
    Location:
    East devon/somerset border
    I'd agree with Komp. I've spent acres of time deleting tags from posts because too many bugger up the look of the forum page.
     
  4. NWOPD

    NWOPD Administrator

    Joined:
    Oct 16, 2020
    Messages:
    272
    Likes Received:
    330
    This is something I realized is an issue recently while viewing the site on mobile. There are way too many auto tags. It makes the forum cluttered, and many are useless.

    Tagging is meant to be a way to quickly find highly relevant linked content as an alternative to search, forum organization, or prefixes. With so many irrelevant tags it does more harm than good.

    Look at all those tags!

    [Attached image: F78C132A-AF1B-4826-8E09-7CE458E8FE60.png]

    I’m going to disable auto-tagging for now.

    Going forward I’m going to re-evaluate the system. It might be in our best interest to remove tag functionality altogether, though that may have negative impacts on search results as we’d lose all of our tag “pages.”

    At some point I do want to come up with a solution to remove or reduce irrelevant auto-tags. That’ll be a project for sure.

    Even allowing member-chosen tags isn’t ideal because anyone can come up with a tag that’s irrelevant or not in use. (Still a step above auto-tags though).

    I think for tags to be useful they need to be limited to 100-200 important keywords and be accurately used. An ideal system might be admin pre-defined tags where members can select from 5-10 relevant tags from a list based on the forum they’re in.

    Auto-tagging is now disabled. I’ll be revisiting the tag system in the future, and am open to suggestions.
     
    jannert, Komposten and OurJud like this.
  5. NWOPD

    NWOPD Administrator

    I may have spoken too soon: there are some technical difficulties with the tag system, but I will be removing auto-tagging as soon as possible.

    Edit: Disabled our tagging mod. Tagging still exists: you can tag threads. The numerous old tags still exist. However, I've successfully disabled the auto-tagging feature and removed tags from the thread listing pages (there were so many it was cluttered). Tags still show up at the top of each thread, however.
     
    Last edited: Dec 13, 2020
  6. Komposten

    Komposten Insanitary pile of rotten fruit Contributor

    You should have seen what it was like before I blacklisted some 200-300 stop words from being used as tags...
    (Words like "the", "a", "me", etc.)
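A stop-word blacklist combined with a hard cap on the number of tags can be sketched in a few lines. This is only an illustration, not how the tagging add-on works: the word list, the `candidate_tags` function and the `MAX_TAGS` value are assumptions made for the example (the cap of three comes from the limit mentioned in the opening post).

```python
import re

# Hypothetical stop-word list; the real blacklist held some 200-300 words.
STOP_WORDS = {"the", "a", "me", "an", "and", "of", "to", "in", "is", "it"}
MAX_TAGS = 3  # assumed cap, taken from the limit described in the opening post


def candidate_tags(title: str, max_tags: int = MAX_TAGS) -> list[str]:
    """Return up to max_tags lowercase words from title, skipping stop words."""
    tags: list[str] = []
    for word in re.findall(r"[a-z']+", title.lower()):
        # Skip blacklisted words and words already picked.
        if word in STOP_WORDS or word in tags:
            continue
        tags.append(word)
        if len(tags) == max_tags:
            break
    return tags


print(candidate_tags("The best way to describe a world in the story"))
```

Capping inside the generator means the submit-time limit can never be exceeded, which is the mismatch the opening post complains about.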
     
    jannert likes this.
  7. NWOPD

    NWOPD Administrator

    Yikes. I can imagine. Thanks for taking that step. What a nightmare.

    The system we used was a “premium” tag add-on, Tag Essentials, which I believe used to offer useful functionality. We might’ve installed it before XF added built-in tag functionality.

    When I was investigating this I had to research the original modification support thread, and based on other people’s comments auto-tagging used to be more limited/precise but at some point broke and began auto-tagging any random word in the post or title. The developer never offered a solution. I’m sorry to say the plugin’s been deprecated for several years and probably broke at some point when we upgraded our XF version.

    I’m sorry it took me so long to address this.

    I think I’ll eventually go through those 10000 tags and delete or merge any that are used fewer than 5/10/20 times. Page-by-page will be a bitch though... better to come up with an SQL query to do it in 20 seconds. Long-term I definitely want to develop a limited tag system that actually helps with content curation. I feel like tags can be a great feature if implemented properly. With our forum structure it’s almost impossible to find the older high quality content; tags could help with this.
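The "find every tag used fewer than N times" query might look something like the sketch below. It runs against an in-memory SQLite database with a made-up schema: the `tag` and `tag_content` tables, their columns and the sample rows are all assumptions for illustration and are not XenForo's actual tables, so the SQL would need adapting (and testing on a backup) before touching a live forum.

```python
import sqlite3

# Hypothetical schema for illustration only; not XenForo's actual tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tag (tag_id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE tag_content (
        tag_id INTEGER REFERENCES tag(tag_id),
        thread_id INTEGER,
        PRIMARY KEY (tag_id, thread_id)
    );
    INSERT INTO tag VALUES (1, 'story'), (2, 'stroy'), (3, 'plot');
    INSERT INTO tag_content VALUES
        (1, 10), (1, 11), (1, 12), (2, 10), (3, 11), (3, 12);
""")

MIN_USES = 2  # assumed threshold; the post floats 5, 10 or 20


def rarely_used_tags(conn, min_uses):
    """Return (tag_id, name, uses) for tags applied fewer than min_uses times."""
    return conn.execute(
        """
        SELECT t.tag_id, t.name, COUNT(tc.thread_id) AS uses
        FROM tag AS t
        LEFT JOIN tag_content AS tc ON tc.tag_id = t.tag_id
        GROUP BY t.tag_id
        HAVING uses < ?
        ORDER BY t.tag_id
        """,
        (min_uses,),
    ).fetchall()


def delete_tags(conn, tag_ids):
    """Delete the given tags and their thread links in one transaction."""
    with conn:  # commits on success, rolls back if anything raises
        for tag_id in tag_ids:
            conn.execute("DELETE FROM tag_content WHERE tag_id = ?", (tag_id,))
            conn.execute("DELETE FROM tag WHERE tag_id = ?", (tag_id,))


rare = rarely_used_tags(conn, MIN_USES)
print(rare)
delete_tags(conn, [tag_id for tag_id, _, _ in rare])
remaining = [row[0] for row in conn.execute("SELECT name FROM tag ORDER BY tag_id")]
print(remaining)
```

Listing the rare tags first and deleting in a second step leaves room to eyeball the list before anything is removed.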
     
  8. Komposten

    Komposten Insanitary pile of rotten fruit Contributor

    A lot of the tags are typos or alternate spellings of the same thing. Would be nice to have those merged, but that would require going through all 10000 to find which ones are the good ones (so we don't accidentally merge the correct tags into the typo ones).

    Just for fun, I wrote a script a few months ago that scraped all 10000 tags and did pairwise comparisons to get a list of all pairs that were similar to each other. But even after that you'd still need to filter out all the pairs that look similar but are not the same thing (like "store" and "story" in the excerpt below), to make sure that only actual typos/alternative spellings are merged (can probably be done using dictionary look-ups).
    Code:
    {
      "tag1": "storyline ",
      "tag2": "story line ",
      "score": 0.9090909090909091,
      "limit": 0.9
    }, {
      "tag1": "story ",
      "tag2": "story. ",
      "score": 0.8571428571428572,
      "limit": 0.8571428571428571
    }, {
      "tag1": "story length ",
      "tag2": "story lengths ",
      "score": 0.9285714285714286,
      "limit": 0.9
    }, {
      "tag1": "story idea ",
      "tag2": "story ideas ",
      "score": 0.9166666666666666,
      "limit": 0.9
    }, {
      "tag1": "story arc ",
      "tag2": "story arcs ",
      "score": 0.9090909090909091,
      "limit": 0.9
    }, {
      "tag1": "store ",
      "tag2": "story ",
      "score": 0.8333333333333334,
      "limit": 0.8333333333333334
    }
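The original script isn't shown, so the following is only a guess at the general approach. The scores in the excerpt are consistent with a normalised edit distance, 1 - distance / length of the longer tag (for example 1 - 1/11 ≈ 0.909 for "storyline " and "story line "), so the sketch uses that. The function names and the fixed `limit` are assumptions; the excerpt's own "limit" appears to vary with tag length, which is not reproduced here.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """1.0 for identical strings, falling towards 0.0 as edits accumulate."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest


def similar_pairs(tags, limit=0.8):
    """Yield (tag1, tag2, score) for every pair scoring at or above limit."""
    for tag1, tag2 in combinations(tags, 2):
        score = similarity(tag1, tag2)
        if score >= limit:
            yield tag1, tag2, score


# Sample tags copied from the excerpt (trailing spaces included), plus "plot ".
tags = ["storyline ", "story line ", "store ", "story ", "plot "]
for tag1, tag2, score in similar_pairs(tags):
    print(f"{tag1!r} ~ {tag2!r}: {score:.4f}")
```

Comparing every pair is quadratic, so 10000 tags means roughly 50 million comparisons; slow in pure Python, but fine for a one-off job.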
     
  9. big soft moose

    big soft moose An Admoostrator Admin Staff Supporter Contributor Community Volunteer

    It would be interesting to know what you mean by 'high quality content'... this has also come up in the past with insights and with the articles... where most of what's posted is not in any way high quality.
     
  10. NWOPD

    NWOPD Administrator

    @Komposten that’s very interesting. I sort of understand pairing and dictionary lookups on a conceptual level, but my programming experience is still novice.

    Assuming one could identify the tags to be merged and could filter the “false positives,” could that data be plugged into an SQL action? Would it just be a very long query? I’m sure I could write a query to drop all tags that are only used X number of times, but I’d have no idea how to turn the data you’re talking about into a query that merged appropriate tags and dropped the others.


    @big soft moose

    Perhaps useful is a better word?

    A new sci-fi writer is trying to find a way to describe another world, but there happens to be a 6-year-old thread with a lot of great replies/ideas. That poster could create a new thread, but they might also benefit from reading a good thread on the same topic. Maybe they create their thread anyway, but miss out on an idea or perspective that could also have benefited them from the tagged thread. Maybe not the best example, just one off the top of my head.

    Sure, a lot of forum threads aren’t high quality content, especially from the perspective of a writer, but that doesn’t mean they aren’t still useful or valuable or should be buried forever, never to be seen again.

    I don’t know, maybe I have an out-of-touch perspective. I know on other forums I’ve participated in I’ve come across very old threads that I found extremely useful or to be exactly what I was looking for.
     
  11. NWOPD

    NWOPD Administrator

    @big soft moose

    I think I’ve found the perfect example.

    I just jumped randomly to one of the first pages of threads in the publishing forum and randomly selected a thread.

    I found this thread I created in 2007. I was a 17-year-old new to writing and my post and replies are so cringeworthy it’s clear I had no idea what I was talking about or even asking. This OP is the definition of the type of low quality post you’re assuming.

    The reply though? @TWErvin2 replied (second post) with a detailed, useful, and dare-I-say high quality post explaining the foundations of the industry and publishing short stories.

    In fact, there are many useful replies in that thread. Many - most - threads probably wouldn’t meet the same bar. But this is the type of content I’m talking about.

    My original post was newbie garbage. His reply is one I think many people would benefit from if it wasn’t buried in the abyss.
     
    Komposten likes this.
  12. Komposten

    Komposten Insanitary pile of rotten fruit Contributor

    I don't think a single SQL query would work (even a very big one), since it would be a lot of smaller actions (merge A into B, C into D, E into F, etc.). But it would be reasonably simple to just generate one query per merge and then run them all as a single transaction. It's worth keeping in mind, though, that if you merge A into B using SQL then you also need to update all references to A to point to B instead (assuming XenForo uses SQL in the correct way, the query should fail and roll back if this isn't handled).
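A rough sketch of the "one small step per merge, all inside a single transaction" idea, using Python's `sqlite3` module against an in-memory database. The `tag` and `tag_content` tables and the `merge_tags` and `tag_id` helpers are made up for the example and are not XenForo's schema; with foreign keys switched on, deleting a tag that is still referenced would fail, which is the safety net described above.

```python
import sqlite3

# Hypothetical schema for illustration only; not XenForo's actual tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tag (tag_id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE tag_content (
        tag_id INTEGER NOT NULL REFERENCES tag(tag_id),
        thread_id INTEGER NOT NULL,
        PRIMARY KEY (tag_id, thread_id)
    );
    INSERT INTO tag VALUES (1, 'story arc'), (2, 'story arcs'), (3, 'stroy');
    INSERT INTO tag_content VALUES (1, 10), (2, 10), (2, 11), (3, 12);
""")
conn.execute("PRAGMA foreign_keys = ON")  # make dangling references an error


def tag_id(conn, name):
    """Look up a tag's id by name, raising LookupError if it does not exist."""
    row = conn.execute("SELECT tag_id FROM tag WHERE name = ?", (name,)).fetchone()
    if row is None:
        raise LookupError(f"unknown tag: {name!r}")
    return row[0]


def merge_tags(conn, merges):
    """Apply every (source_name, target_name) merge, or none of them."""
    with conn:  # one transaction: commit if all succeed, roll back otherwise
        for source, target in merges:
            source_id = tag_id(conn, source)
            target_id = tag_id(conn, target)
            # Repoint threads from source to target, skipping rows that would
            # duplicate an existing (target, thread) link.
            conn.execute(
                "UPDATE OR IGNORE tag_content SET tag_id = ? WHERE tag_id = ?",
                (target_id, source_id),
            )
            # Drop whatever still references the source, then the tag itself.
            conn.execute("DELETE FROM tag_content WHERE tag_id = ?", (source_id,))
            conn.execute("DELETE FROM tag WHERE tag_id = ?", (source_id,))


merge_tags(conn, [("story arcs", "story arc")])
links = conn.execute(
    "SELECT tag_id, thread_id FROM tag_content ORDER BY tag_id, thread_id"
).fetchall()
print(links)
```

Because the whole list runs inside one `with conn:` block, a failure halfway through (an unknown tag name, say) undoes the merges that had already been applied in that call.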

    The hard part is still finding all the false positives (even matching against a dictionary is challenging due to typos in the tags).
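One naive way to apply a dictionary look-up to a similar-looking pair is sketched below. The tiny `DICTIONARY` set and the `classify_pair` function are stand-ins invented for the example; a real run would need a full word list, and this version only understands single-word tags, so multi-word tags and plural variants would still land in the "review by hand" or "keep both" piles.

```python
# Tiny stand-in word list; a real run would load a full dictionary file.
DICTIONARY = {"store", "story", "stories", "plot", "character"}


def classify_pair(tag1: str, tag2: str, dictionary=DICTIONARY) -> str:
    """Label a similar-looking tag pair by which side is a known word."""
    word1, word2 = tag1.strip().lower(), tag2.strip().lower()
    known1 = word1 in dictionary
    known2 = word2 in dictionary
    if known1 and known2:
        return "keep both"  # two real words, e.g. store / story
    if known1:
        return f"merge {word2!r} into {word1!r}"
    if known2:
        return f"merge {word1!r} into {word2!r}"
    return "review by hand"  # neither side is recognised


for pair in [("store ", "story "), ("story ", "stroy "), ("stroy ", "sotry ")]:
    print(pair, "->", classify_pair(*pair))
```

The last case shows the difficulty mentioned above: when both tags are typos, the dictionary cannot say which spelling was intended.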
     
  13. big soft moose

    big soft moose An Admoostrator Admin Staff Supporter Contributor Community Volunteer

    It is, but on the flip side it's also 13 years out of date... the best advice from the biggest names in the business is nearly worthless after that long (Stephen King says as much in the foreword of his new edition of On Writing).

    Which neatly illustrates my point about the issue with the 'insights' forum and to an extent the articles... that creating a pile of high value posts is difficult when the value of the posts degrades over time (especially true of issues around publishing).

    Another example is that for the last two or three years I've been engaged in writing a book as a definitive guide to self publishing... I went back through the marketing chapters the other day and nearly everything I wrote two years ago is now wrong... Facebook ads have changed, Amazon ads have changed massively, WordPress has had the Gutenberg update, the landscape around BookBub ads is now completely different... the whole section needs a rewrite. What was 'high value' content two years ago is now dated and unmarketable trash.
     
    Komposten likes this.
  14. OurJud

    OurJud Contributor Contributor

    You clearly haven't been following my posts in the Random Thought thread.
     
    Xoic likes this.
  15. NWOPD

    NWOPD Administrator

    @big soft moose I see your point, I don’t disagree, but I don’t agree either. Marketing in particular is a field that changes every 6-12 months, so is probably the worst example for my point and best for yours you could make, and I’d argue is an outlier in this conversation due to its rapid time degradation.

    Surely we can find some middle ground on this topic? Perhaps we highlight content that meets a higher standard of usefulness, objectively higher quality, has lower time degradation, acts as some sort of reference or guide, or otherwise stands out as notable? I only see potential upsides, where are the downsides?

    When it comes to tagging, we’re literally tagging everything with anything, surely changing this so tags only highlight useful or notable content is a good thing? I know you’re also talking about other sections of the site too, but in this instance, how can that be a bad thing?

    Tags are sorted from most recent to oldest. This would mean recent posts wouldn’t face the issue you described, and as you moved down the list, time degradation would take effect, but would be obvious when viewing chronologically. The effect being that if we curate and highlight good content it will reach more eyes, help more people, and if it loses its value due to time, it’s merely at the bottom of the list where it can offer historical value. :p

    Where’s my laugh react.
     
    OurJud likes this.
  16. big soft moose

    big soft moose An Admoostrator Admin Staff Supporter Contributor Community Volunteer

    The biggest problem with tags as they exist currently is that they aren't useful for anything and take up a lot of screen room... I'd agree in theory that only tagging useful topics is a good move... my issue is how we make sure that only those useful topics are tagged... unless we go through and apply them post facto, which will be a major job for someone to do.

    My major issue isn't to do with tags at all though (and I'm in the wrong thread)... what I was saying above was more to do with the nightmare we've had with determining what "high quality 'sticky quality' posts that add value and insight into the writer's journey" (from the I&I description) actually are.
     
