Bay 12 Games Forum


Author Topic: Youtube P-Score?  (Read 2514 times)

Zanzetkuken The Great

  • Bay Watcher
  • The Wizard Dragon
Youtube P-Score?
« on: November 01, 2019, 09:13:07 pm »

Some guy by the name of Optimus did some looking through Youtube's source code and found some interesting stuff about the algorithm.  They summarized it in this video, and wrote the details down in these two documents:
https://docs.google.com/document/d/1xyxDZIGztWDqGQGae4Oakkt0VAYB21-OcXgNKYrPzcw/edit
https://docs.google.com/spreadsheets/d/130CDsPSjg2BzzlA476AxjZQDdGiHGejhwV_F1H99RMs/edit

Looking through the past, it seems one of Youtube's verified side channels, "Youtube Advertisers", released this video a couple months ago, so it does seem a bit legitimate.  Though based on a follow-up video by them here, Youtube has already made some changes on the source side, so it can't be personally verified.
Quote from: Eric Blank
It's Zanzetkuken The Great. He's a goddamn wizard-dragon. He will make it so, and it will forever be.
Quote from: 2016 Election IRC
<DozebomLolumzalis> you filthy god-damn ninja wizard dragon

Zanzetkuken The Great

  • Bay Watcher
  • The Wizard Dragon
Re: Youtube P-Score?
« Reply #1 on: November 01, 2019, 10:10:32 pm »

Oh! So that's how they're doing it.


For reference, those several metrics are being fed into a neural network of some variety to churn out the score. Even Google has no clue how the weights were worked out; any supposed "tweaks" to the algorithm are consequences of it still continuing to learn from new data. This is evident by the fact that they're trying to work out several data "features" of a Youtube video. Now, how they actually extract those features is unknown, but that's almost certainly what they're doing with them to calculate said score.

It's pretty suspicious that they're apparently trying to bury this information now through the removal of the information that was found.  And I must admit, however they are calculating the scores is bloody strange.  Like, why are the Wall Street Journal and the New York Times sitting just barely over 800 when even the non-US based SET India is floating in the mid-870s?

Iduno

  • Bay Watcher
Re: Youtube P-Score?
« Reply #2 on: November 01, 2019, 11:59:21 pm »

It's pretty suspicious that they're apparently trying to bury this information now through the removal of the information that was found.  And I must admit, however they are calculating the scores is bloody strange.  Like, why are the Wall Street Journal and the New York Times sitting just barely over 800 when even the non-US based SET India is floating in the mid-870s?

Yeah, I'm not sure storing what they found on Google's servers was a good plan.

Reelya

  • Bay Watcher
Re: Youtube P-Score?
« Reply #3 on: November 02, 2019, 12:59:53 am »

Oh! So that's how they're doing it.


For reference, those several metrics are being fed into a neural network of some variety to churn out the score. Even Google has no clue how the weights were worked out; any supposed "tweaks" to the algorithm are consequences of it still continuing to learn from new data. This is evident by the fact that they're trying to work out several data "features" of a Youtube video. Now, how they actually extract those features is unknown, but that's almost certainly what they're doing with them to calculate said score.

It's pretty suspicious that they're apparently trying to bury this information now through the removal of the information that was found.  And I must admit, however they are calculating the scores is bloody strange.  Like, why are the Wall Street Journal and the New York Times sitting just barely over 800 when even the non-US based SET India is floating in the mid-870s?

Just look at subscribers. SET India has almost 60 million subscribers. The P metrics take that into account. It's the 5th most subscribed channel on Youtube. New York Times has 2.2 million. Not even in the ballpark. This isn't a problem with the metrics, it's the fact that the channels you mentioned aren't very popular.
« Last Edit: November 02, 2019, 01:03:14 am by Reelya »

scourge728

  • Bay Watcher
Re: Youtube P-Score?
« Reply #4 on: November 02, 2019, 12:07:21 pm »

Such a strange set of things on the top of the list, including nursery rhymes for some reason

Reelya

  • Bay Watcher
Re: Youtube P-Score?
« Reply #5 on: November 02, 2019, 12:11:31 pm »

Such a strange set of things on the top of the list, including nursery rhymes for some reason

Yeah. And just so nobody has to google it for context, this is referring to the most subscribed channels, not just the P-scores.

https://en.wikipedia.org/wiki/List_of_most-subscribed_YouTube_channels

List of most-subscribed YouTube channels

1) T-Series
2) PewDiePie
3) Cocomelon - Nursery Rhymes
4) 5-Minute Crafts
5) SET India

Two of the top 5 are Bollywood related, one is gamer-related, one is a children's channel, the other is a dumb crafts channel. Where I think the dissonance is coming from is demographic: we fail to see the wider demographics here.

The top-subscribed channels don't conform to our biases - Western Millennial and Gen-X mostly-male gamers, so we think that it's "weird" that these things have so many subscribers, when in fact it's mostly the stuff that we like that's niche and fringe. For example the prev poster thought it was weird that "even" a "non-US" channel had such a high P-score, when that channel is clearly (even in the name) in a country with about 1.5 billion people living there. That alone suggests why it could have such a high score.
« Last Edit: November 02, 2019, 12:18:25 pm by Reelya »

scourge728

  • Bay Watcher
Re: Youtube P-Score?
« Reply #6 on: November 02, 2019, 12:15:25 pm »

I have this strange feeling bots may be involved :P I have no proof, other than the question: do that many people really enjoy nursery rhymes that much?

Reelya

  • Bay Watcher
Re: Youtube P-Score?
« Reply #7 on: November 02, 2019, 12:27:26 pm »

That's the demographic issue you're missing. Who do you think clicks that? People with kids do, people who use the Youtube Kids app. Parents are also more concerned about their kids finding stuff they're not supposed to, so they're more likely to just subscribe to a channel and only play stuff from the channel.

Look at their top videos on that channel, they have 200+ million views. And looking at their video output, they have 18 videos dated less than 2 months ago. So they're pushing out two new 3D-animated videos every week. Note that the longer videos actually have a big red logo in the top right of the thumbnail showing how many minutes they go for. This is clearly targeted at parents, and explains why so many subscribe. They subscribe so they can quickly find this channel, pick a suitably long video and get stuff done while the toddler is entertained.

The way you get to the top of Youtube is to keep pushing out content on a regular basis and to clearly understand the use-case of your target audience.
« Last Edit: November 02, 2019, 12:50:17 pm by Reelya »

Bumber

  • Bay Watcher
  • REMOVE KOBOLD
Re: Youtube P-Score?
« Reply #8 on: November 02, 2019, 12:41:19 pm »

For example the prev poster thought it was weird that "even" a "non-US" channel had such a high P-score, when that channel is clearly (even in the name) in a country with about 1.5 billion people living there.
Fake news! America's P-score is yuge! We've got the biggest P-score in the world!
Reading his name would trigger it. Thinking of him would trigger it. No other circumstances would trigger it- it was strictly related to the concept of Bill Clinton entering the conscious mind.

THE xTROLL FUR SOCKx RUSE WAS A........... DISTACTION        the carp HAVE the wagon

A wizard has turned you into a wagon. This was inevitable (Y/y)?

Zanzetkuken The Great

  • Bay Watcher
  • The Wizard Dragon
Re: Youtube P-Score?
« Reply #9 on: November 02, 2019, 04:08:53 pm »

For example the prev poster thought it was weird that "even" a "non-US" channel had such a high P-score, when that channel is clearly (even in the name) in a country with about 1.5 billion people living there. That alone suggests why it could have such a high score.

My confusion was more on the basis of 'why are these guys so far away from the more or less similar sized CNN, Fox News, MSNBC, and, to an extent, BBC and CBC?'  Hell, even VICE and CBS are tied around SET India, and the two I mentioned (USA Today is apparently down with them as well).  Given the extent of name recognition that exists for the Wall Street Journal and New York Times, it seems strange that they are that low compared to all the rest of those.

Reelya

  • Bay Watcher
Re: Youtube P-Score?
« Reply #10 on: November 03, 2019, 12:47:17 am »

For example the prev poster thought it was weird that "even" a "non-US" channel had such a high P-score, when that channel is clearly (even in the name) in a country with about 1.5 billion people living there. That alone suggests why it could have such a high score.

My confusion was more on the basis of 'why are these guys so far away from the more or less similar sized CNN, Fox News, MSNBC, and, to an extent, BBC and CBC?'  Hell, even VICE and CBS are tied around SET India, and the two I mentioned (USA Today is apparently down with them as well).  Given the extent of name recognition that exists for the Wall Street Journal and New York Times, it seems strange that they are that low compared to all the rest of those.

Name recognition isn't one of the metrics, because if you check, that hasn't translated across into subscribers and youtube views. There's a huge gap between franchises who have built their model around Youtube, and traditional media who are just basically piddling in the kiddy-end of the pool by comparison.

It's pretty suspicious that they're apparently trying to bury this information now through the removal of the information that was found.

BTW I wouldn't say it's "suspicious". It's necessary. No matter how good the algorithm is, people will work out how to game it. So if the algorithm becomes public you probably need to change the algorithm. There's money in cheating the system, and if you give out a "cheater's guide" then you have to spend a whole lot more money and be a whole lot more intrusive with how your algorithm works to detect the cheats.
« Last Edit: November 03, 2019, 02:07:37 am by Reelya »

Zanzetkuken The Great

  • Bay Watcher
  • The Wizard Dragon
Re: Youtube P-Score?
« Reply #11 on: November 03, 2019, 09:46:03 am »

BTW I wouldn't say it's "suspicious". It's necessary. No matter how good the algorithm is, people will work out how to game it. So if the algorithm becomes public you probably need to change the algorithm. There's money in cheating the system, and if you give out a "cheater's guide" then you have to spend a whole lot more money and be a whole lot more intrusive with how your algorithm works to detect the cheats.

Yeah, if the information was public (the features and the p-scores generated from them) then you could trivially "cheat" a neural network by training one on those values alone to re-create the functionality of Google's own internal method, letting you generate ideal versions of those features to maximize p-scores. Then you just reverse-engineer how those features were worked out (likely rather straightforward all things considered, and something you could also train via neural network) and you can generate ideal video content to maximize p-scores.

So yeah, having that info public spells disaster. The algorithm being "public" in this case doesn't really matter, because as long as the actual classifiers they use per video are private, there's no trivial way to reproduce. With that info public, though, then reproducing is trivial.
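
To make the "train your own copy" idea above concrete, here's a toy sketch. Everything in it is made up for illustration: `hidden_score` stands in for the private scoring model, the three features and their weights are invented, and a plain linear model fitted by gradient descent is swapped in for the neural network to keep it short. It says nothing about how Youtube actually computes anything; it only shows that public (features, score) pairs are enough to recover a scoring function.

```python
import random

def hidden_score(features):
    # Hypothetical stand-in for the private scoring model.
    # These weights are invented for illustration only.
    secret_weights = [0.5, 0.3, 0.2]
    return sum(w * x for w, x in zip(secret_weights, features))

def fit_surrogate(samples, lr=0.5, epochs=2000):
    # Fit a linear model to (features, score) pairs with batch
    # gradient descent on the mean squared error.
    n_features = len(samples[0][0])
    weights = [0.0] * n_features
    for _ in range(epochs):
        grads = [0.0] * n_features
        for x, y in samples:
            err = sum(w * xi for w, xi in zip(weights, x)) - y
            for i, xi in enumerate(x):
                grads[i] += err * xi
        weights = [w - lr * g / len(samples) for w, g in zip(weights, grads)]
    return weights

# "Leaked" public data: feature vectors with their published scores.
rng = random.Random(0)
samples = []
for _ in range(200):
    x = [rng.random() for _ in range(3)]
    samples.append((x, hidden_score(x)))

# The surrogate's weights end up very close to the secret ones.
learned = fit_surrogate(samples)
print([round(w, 3) for w in learned])
```

Once you have the surrogate, you can score any candidate feature vector yourself, which is the part that makes gaming the system cheap.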

I find that our opinions seriously diverge here. While that risk may exist, I don't view it as great enough compared to the damage being done to creators on the platform, who are being screwed over by the lack of information on three things: how their content is generally being rated on the six-category Y to X scale, why a particular video is being restricted for ads, and which videos the algorithm views as good so they can improve their content.  You are talking about a problem of a few thousand on a platform of multiple millions.  Sure, not all of those are active, but they would still be the vast majority of channels with content made on the platform.

I mean, hell, you guys are acting like all the different types of TV ratings for content and viewership aren't a thing, and the P-score aspect isn't similar to trying to determine the equivalent to Primetime for an internet-based platform.