Improvements to the forum
-
@lia So, right now, as coded, this RE will strip off the ?t=12 from your example and play the video from the start. Are you saying that you would prefer, instead, for the RE to just not match? Because, if so, then that is actually pretty easy. I will supply that in a followup so I can explain it.
If you would like the "t=" to translate to "start=" then I would need to know the name of the plugin, so I can look at the source and see what would need to change. Can you confirm that this is it: https://github.com/NicolasSiver/nodebb-plugin-ns-embed
If that is the plugin, then what we would do to handle both cases is to create multiple items in the "default-rules.json" data structure, one for URLs without a time and another for URLs with one. If that is what you want to do, I can provide the stanzas. But it is important to note that making more changes may make it more difficult to upgrade the plugin. Generally, I don't recommend people make these kinds of changes to plugins because it complicates applying security patches.
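For illustration only, here is a rough sketch of what the two-rule idea could look like. The rule names, the simplified regexes, the trimmed replacement HTML and the embed() helper are all assumptions made up for this example, not the plugin's actual code. It also assumes that t= is a plain number of seconds and that the first matching rule wins; the only outside fact relied on is that YouTube's embed URL accepts a start parameter in seconds.

```javascript
// Hypothetical rules, loosely modeled on the plugin's stanza fields.
// Names, regexes and replacement markup are illustrative assumptions.
const rules = [
  {
    name: "youtube-timed", // hypothetical: ID followed by ?t=<seconds> or &t=<seconds>
    regex: /(?:https?:\/\/)?(?:www\.)?(?:youtube\.com\/watch\?v=|youtu\.be\/)([\w-]{6,11})[?&]t=(\d+)(?![\w&])/,
    replacement: "<iframe src='https://www.youtube.com/embed/$1?start=$2'></iframe>",
  },
  {
    name: "youtube-plain", // hypothetical: ID with nothing after it
    regex: /(?:https?:\/\/)?(?:www\.)?(?:youtube\.com\/watch\?v=|youtu\.be\/)([\w-]{6,11})(?![\w?&-])/,
    replacement: "<iframe src='https://www.youtube.com/embed/$1'></iframe>",
  },
];

// Assumption: rules are tried in order and the first match wins.
function embed(text) {
  for (const rule of rules) {
    if (rule.regex.test(text)) {
      return text.replace(rule.regex, rule.replacement);
    }
  }
  return text; // no rule matched, leave the link alone
}

console.log(embed("https://youtu.be/f0i-KnPu4Rw?t=12"));
// <iframe src='https://www.youtube.com/embed/f0i-KnPu4Rw?start=12'></iframe>
console.log(embed("https://youtu.be/f0i-KnPu4Rw"));
// <iframe src='https://www.youtube.com/embed/f0i-KnPu4Rw'></iframe>
```

Anything the two rules don't recognize, such as a playlist link, is returned unchanged in this sketch.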
-
@biell Nail on the head. Currently it will always start at 0 and ignore everything else, including playlists, which in turn means you can't share a playlist. For text links that point to a video, it means the link text goes missing, buried in the embed code that we don't see.
That is indeed the plugin. I think it will be best to have it only embed on links without anything on the end of the ID. That way it doesn't embed a timecode link, playlist or text links where youtube is linked.
-
@lia The reason you are having trouble is that the {6,11} is a variable-width greedy construct. So, if you put something after it, the greedy match will just shrink to accommodate it. What we need is a zero-width negative lookahead assertion, but the key to its success is to ensure it includes the previous expression, so that the {6,11} doesn't just match 10 characters to keep from failing our negative assertion. Ironically, in this case, the greedy RE match isn't greedy enough and we must force it to gobble up the whole string. With the inclusion of the original character class as a subset of our additional negative assertion character class, the RE will no longer match URLs with any options (time or otherwise) but will still match simple URLs.
(?:<a.*?)?(?:https?:\/\/)?(?:www\.)?(?:youtube\.com\/(?!user|channel)\S*(?:(?:\/e(?:mbed))?\/|watch\?(?:\S*?&?v\=))|youtu\.be\/)([\w-]{6,11})(?![\w?&-])(?:.*?\/a>)?
Two things here. First, I replace a-zA-Z0-9_ with \w because it means the same thing (at least for a URL; \w can accept more in UTF-8 in some engines, but we can ignore that). Second, now that we have that cleaned up, we have between 6 and 11 (inclusive) of [\w-] but, critically, not followed by [\w?&-], which adds two characters to the character class: ? and &.
We exclude these because they start variable assignments in an HTTP GET request, with ? starting the first variable and signifying to the HTTP server that variable arguments are starting, and & separating all subsequent variables. Because we include both ? and &, we can match www.youtube.com/watch?v=f0i-KnPu4Rw but not www.youtube.com/watch?v=f0i-KnPu4Rw&t=12, and we can match youtu.be/f0i-KnPu4Rw without matching youtu.be/f0i-KnPu4Rw?t=12
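To make the "not greedy enough" point concrete, here is a small sketch that can be pasted into node. Both patterns are cut-down versions written for illustration (only the youtu.be branch, and the names naive and strict are made up); they are not the plugin's rule. The first uses a lookahead that only lists ? and &, the second also includes the ID's own character class.

```javascript
// Cut-down illustration patterns (assumptions, not the plugin's rule).
const naive = /youtu\.be\/([\w-]{6,11})(?![?&])/;     // lookahead lists only ? and &
const strict = /youtu\.be\/([\w-]{6,11})(?![\w?&-])/; // lookahead also lists the ID characters

const timed = "https://youtu.be/f0i-KnPu4Rw?t=12";
const plain = "https://youtu.be/f0i-KnPu4Rw";

// With the naive lookahead, {6,11} backs off to 10 characters so that the
// next character is "w" instead of "?", and the URL still matches.
console.log(naive.exec(timed)[1]);  // f0i-KnPu4R

// With the ID characters inside the lookahead, backing off never helps,
// because every shorter match is followed by another ID character.
console.log(strict.exec(timed));    // null

// A simple URL with nothing after the ID still matches in full.
console.log(strict.exec(plain)[1]); // f0i-KnPu4Rw
```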
-
@biell That works! Thank you so much :)
I've added a close bracket to the [\w?&-] so it looks like [\w?)&-], in the hope that it also won't embed when the URL is inside a link like [Link text](Youtube link). That works in https://regex101.com/ but not here, so I wonder if it gets read differently.
-
@lia I cannot envision why that wouldn't work. You could try putting a backslash \ in front of the paren ), but that should be completely unnecessary. I ran this in node, and it matches as expected:
$ node
Welcome to Node.js v17.6.0.
Type ".help" for more information.
> var link=/(?:<a.*?)?(?:https?:\/\/)?(?:www\.)?(?:youtube\.com\/(?!user|channel)\S*(?:(?:\/e(?:mbed))?\/|watch\?(?:\S*?&?v\=))|youtu\.be\/)([\w-]{6,11})(?![\w?)&-])(?:.*?\/a>)?/;
undefined
> link.exec("https://youtube.com/watch?v=f0i-KnPu4Rw");
[
  'https://youtube.com/watch?v=f0i-KnPu4Rw',
  'f0i-KnPu4Rw',
  index: 0,
  input: 'https://youtube.com/watch?v=f0i-KnPu4Rw',
  groups: undefined
]
> link.exec("https://youtube.com/watch?v=f0i-KnPu4Rw)");
null
> link.exec("https://youtu.be/f0i-KnPu4Rw");
[
  'https://youtu.be/f0i-KnPu4Rw',
  'f0i-KnPu4Rw',
  index: 0,
  input: 'https://youtu.be/f0i-KnPu4Rw',
  groups: undefined
]
> link.exec("https://youtu.be/f0i-KnPu4Rw)");
null
>
Also, this plugin is broken, in my opinion, because it should not attempt to run in a code block.
-
@biell said in Improvements to the forum:
What we need is a zero width negative lookahead assertion
blech... glad that's over!
regex was designed by lizard ppl.
-
@notsure I love regular expressions. But, I do agree that it is a hammer, and not all problems are nails.
In this case, there is no reason for youtube.com and youtu.be URLs to be configured from within the same stanza.
{
  "name": "youtube",
  "displayName": "Youtube",
  "icon": "fa-youtube",
  "regex": "(?:<a.*?)?(?:https?:\\/\\/)?(?:www\\.)?(?:youtube\\.com\\/(?!user|channel)\\S*(?:(?:\\/e(?:mbed))?\\/|watch\\?(?:\\S*?&?v\\=))|youtu\\.be\\/)([a-zA-Z0-9_-]{6,11})(?:.*?\\/a>)?",
  "replacement": "<div class='embed-wrapper'><div class='embed-container'><iframe src='https://www.youtube.com/embed/$1' frameborder='0' allowfullscreen></iframe></div></div>"
},
This should be two different configurations, and this idea that they support using them within anchor <a> HTML tags was just unnecessary.
-
@biell said in Improvements to the forum:
@notsure I love regular expressions.
interesting... on a totally unrelated subject, what color is ur blood lizard man?
-
@biell said in Improvements to the forum:
Also, this plugin is broken, in my opinion, because it should not attempt to run in a code block.
Agreed, I even have the embed plugin after all the others. Markdown occurs before the embed plugin, yet it still wants to poke at it.
Did wonder if \W instead of \w would work but no joy. What you gave should just work, so I think whatever is running it on the back end is bugged. At least it doesn't generate embeds for timestamps anymore though, so thank you for solving that one :)
Timestamp example: https://www.youtube.com/watch?v=f0i-KnPu4Rw?t=10
Playlist example: https://www.youtube.com/watch?v=gidOwEmVq5w&list=PLinsBwlGP89HoLf9d1VwmLIqjBikA38d3
-
@lia said in Improvements to the forum:
if \W instead of \w would work but no joy.
\W is actually the inverse of \w, so it matches every character except [a-zA-Z0-9_].
-
@notsure 😛