
ESPN (A bunch of Death-Spiraling maroons)

Well, NOT in their defense: if they're just averaging the FPI of the remaining opponents to get "remaining strength of schedule", that alone shows they have no business whatsoever telling anyone what numbers will and won't tell you, because they clearly have no idea. It wouldn't take much effort to come up with a metric better than FPI, and it would take even less to come up with a better measure of strength of schedule. No reasonable person could look at the Buckeyes' remaining schedule and Clemson's (or LSU's) and not say that the Buckeyes' slate is far tougher.
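To put a number on that last point, here's a minimal sketch of what a less lazy schedule measure could look like (the ratings, slates, and logistic scale below are all made up for illustration; I'm only assuming ratings can be mapped to win probabilities somehow, not that this is anyone's actual model): instead of averaging opponent ratings, ask how likely a top-ten-caliber team would be to win out against the slate. Two genuinely hard games crush that probability in a way a simple average never captures.

```python
import math

def win_prob(team_rating, opp_rating, scale=7.0):
    # Hypothetical mapping from a rating gap to a win probability
    # (a logistic curve; the scale is a guess, not anyone's actual model).
    return 1.0 / (1.0 + math.exp(-(team_rating - opp_rating) / scale))

def avg_opponent_rating(opponent_ratings):
    # The lazy version: "average the FPI of the remaining opponents."
    return sum(opponent_ratings) / len(opponent_ratings)

def survival_sos(opponent_ratings, reference_rating=20.0):
    # A better-behaved measure: the probability that a hypothetical
    # top-ten-caliber team (reference_rating) wins out against the slate.
    # Lower probability = harder schedule; two hard games hurt far more
    # here than they do in a simple average.
    p = 1.0
    for opp in opponent_ratings:
        p *= win_prob(reference_rating, opp)
    return p

# Made-up remaining slates, for illustration only.
buckeyes_remaining = [15.0, 18.0]   # two ranked opponents
clemson_remaining = [2.0, 5.0]      # two middling opponents

print(avg_opponent_rating(buckeyes_remaining), avg_opponent_rating(clemson_remaining))
print(survival_sos(buckeyes_remaining), survival_sos(clemson_remaining))
```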
 
My main problem with things like the FPI is this: if it's really based on "complicated math of hard data" (as the guy who runs it claims), how the hell can you have a preseason ranking when, for a new season, there is no data yet? At that point you can only assume the preseason ranking is BS built on past assumptions and biases, which in turn sets the stage for BS rankings the rest of the season, since everything starts from that original BS.
 
I've been trying to understand the logic of the FPI, but it's just not possible.

Good example.

The description claims the rankings reflect where they expect teams to finish the rest of the season, based on whatever wonky-ass math they decide to throw out.

Iowa State is ranked ahead of Baylor.

Baylor has 0 losses; Iowa State has 4.

Baylor beat Iowa State head to head.

They project Baylor to finish with 11 wins and Iowa State with 7.

Yet according to their rankings, Iowa State is the "better team" for the rest of the season.

:ohbrother:
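For what it's worth, my guess at how they square that circle (the numbers below are invented, purely to show the arithmetic, not actual FPI output): the projected win total is basically wins already banked plus the sum of per-game win probabilities over the remaining schedule, so a team their ratings call "better" can still project fewer wins once it has four losses in the bank. Doesn't make the ranking any less silly, but that's presumably the math.

```python
# Hypothetical numbers, purely to show the arithmetic -- not actual FPI output.
def projected_wins(current_wins, remaining_win_probs):
    # Projected season wins = wins already banked plus the expected
    # number of wins over the remaining games.
    return current_wins + sum(remaining_win_probs)

baylor = projected_wins(current_wins=9, remaining_win_probs=[0.70, 0.60, 0.55])
iowa_state = projected_wins(current_wins=5, remaining_win_probs=[0.80, 0.75, 0.70])

# Baylor projects to more total wins even if Iowa State's per-game odds are better.
print(baylor, iowa_state)
```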

They're obviously using some in-house Machine Learning. They even talk about other people's algorithms at times.
Nominally, I'd assume this is supervised. But it's such a cluster that it almost has an unsupervised "learning to play a video game by pressing random buttons" feel to it.
But who has any idea what their "factors" are? Given the stratification of conferences, conference and/or team identity must be one. That is to say, the algorithm knows the CFP/AP/Coaches polls over-rank certain teams and treats that as emblematic of reality... i.e., an objective fact rather than a subjective perception.
They have several versions of the same thing, too... e.g., trying to predict how the CFP committee will vote, even though the voters change every year... and thus the dynamics around edge cases are inherently not constant...

In short, I think it's a highly complex nonsense generator with some implicitly built-in business biases to skew the numbers.
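If you want to see how a poll-trained model ends up treating that perception as fact, here's a toy sketch (fake data, plain scikit-learn, nothing to do with ESPN's actual pipeline): train a supervised model with a poll-style score as the label and conference identity as a feature, and it dutifully learns the conference bump and reports it back as if it were an objective property of the teams.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n_teams = 200

# Toy data: each team has an underlying "true strength" the polls never observe directly.
true_strength = rng.normal(0.0, 10.0, n_teams)
in_favored_conf = rng.integers(0, 2, n_teams)   # 1 = team plays in the "favored" conference

# Pretend the poll-style label systematically bumps favored-conference teams by ~5 points.
poll_score = true_strength + 5.0 * in_favored_conf + rng.normal(0.0, 2.0, n_teams)

# On-field stats track true strength, not the poll bump.
yards_margin = true_strength + rng.normal(0.0, 3.0, n_teams)

# Supervised model trained to reproduce the poll score from stats + conference identity.
X = np.column_stack([yards_margin, in_favored_conf])
model = LinearRegression().fit(X, poll_score)

# The learned coefficient on conference identity lands near 5: the model has absorbed
# the poll's subjective bump and will repeat it in every "objective" prediction it makes.
print(model.coef_)
```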
 
... spitballing, but it would make a very interesting case study on the limits and mis/abuses of machine learning.
Grad research would easily get at least one paper out of it. Probably several, when you factor in reverse-engineering the ML, exploring how ML could be applied to sports analysis, and then implementing a competent version on the topic.
My institution doesn't have grad students, and this is a bit ahead of our undergrads imo, but it seems like something Ohio State CS (or any B1G school) could dabble in.
It's fuck ntre ame week here.
 
I love how they believe Clemson's remaining schedule of Wake Forest Gump and South ("I just got beat by App State") Carolina is tougher than OSU's, with Penn State and Michigan......

Y'all are failing, again, to SEC! SEC! Sure, South Carolina lost to Appy State. But South Carolina beat Georgia. And Appy State has a win over an SEC team, so that's nearly a good loss for USC!

Neither of Ohio State's last two regular-season opponents beat any SEC teams. So there you have it..... weaker schedule.
 
It's a simple least-squares model, based on what he's explained to me. When I've called him out on it, he says it "does what it is supposed to do" and then references least-squared error. That's when I went after him on his choice of loss function.
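For anyone who hasn't seen one, a bare-bones least-squares rating looks something like this (a generic Massey-style setup with made-up game results; I'm not claiming this is FPI's actual formulation): solve for one rating per team so that rating differences best fit the observed point margins.

```python
import numpy as np

# Made-up results: (home_team, away_team, home_margin).
games = [(0, 1, 14), (1, 2, -3), (2, 0, -7), (0, 2, 10), (1, 0, -21)]
n_teams = 3

# Design matrix: each game is a row with +1 for the home team, -1 for the away team.
X = np.zeros((len(games), n_teams))
y = np.zeros(len(games))
for i, (home, away, margin) in enumerate(games):
    X[i, home], X[i, away] = 1.0, -1.0
    y[i] = margin

# Least-squares ratings; the system is only identified up to a constant,
# so pin the average rating at zero.
ratings, *_ = np.linalg.lstsq(X, y, rcond=None)
ratings -= ratings.mean()

print(ratings)  # higher = stronger; rating differences predict point margins
```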

Anyhow, at a minimum, for early-season game predictions to be of any value, he should be considering a Bayesian model where the prior is based on things like returning players and their stats, attendance at road games, and other similar features. Picking the prior should not be overly difficult, since someone in his position has ample historical data to work with and can use a multitude of features from a prior year to predict the following year.
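Something along these lines is all I'm asking for. Here's a toy normal-normal version (the prior features and weights are invented placeholders, not his model or anyone's): each team starts at a prior rating built from last season and returning production, and game results pull it away from that prior as the season accumulates data.

```python
def preseason_prior(last_year_rating, returning_production, road_attendance_z):
    # Hypothetical prior mean: the weights are illustrative placeholders.
    # In practice you'd fit them on historical year-over-year data.
    mean = (0.7 * last_year_rating
            + 6.0 * (returning_production - 0.5)
            + 1.5 * road_attendance_z)
    var = 9.0   # prior uncertainty before a single snap is played
    return mean, var

def update_rating(prior_mean, prior_var, margins, opponent_ratings, obs_var=100.0):
    # Conjugate normal-normal update: each game margin is treated as a noisy
    # observation of (team_rating - opponent_rating).
    mean, var = prior_mean, prior_var
    for margin, opp in zip(margins, opponent_ratings):
        obs = margin + opp                       # implied team rating from this game
        new_var = 1.0 / (1.0 / var + 1.0 / obs_var)
        mean = new_var * (mean / var + obs / obs_var)
        var = new_var
    return mean, var

mean0, var0 = preseason_prior(last_year_rating=12.0, returning_production=0.8, road_attendance_z=1.2)
mean5, var5 = update_rating(mean0, var0, margins=[24, 10, -3, 17, 31],
                            opponent_ratings=[-2.0, 5.0, 14.0, 1.0, -6.0])

print((mean0, var0))   # preseason: all prior, no data
print((mean5, var5))   # five games in: the data has started to dominate the prior
```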

It used to be that things like a Gibbs sampling / Markov chain Monte Carlo model were a LOT of work to pull off. In '95 I was involved in writing a Bayesian multinomial probit in a matrix algebra language (SAS/IML) and it sucked... there was no direct function for a Kronecker product, if I recall correctly, among other things... but now, with the general availability of every model under the sun in Python and R, there's no excuse for shit models. If the results are not good, it just means the modeler is uninformed or fucking lazy. In this case, I think it may be a combination of the two.

I mean, hell, he probably doesn't even need to go that complex. I imagine that an XGBoost model with some features to represent past years' history and some type of feature to represent time (number of games played in the current season) would create interactions that would fairly naturally weight early-season predictions toward the historical data and give more accurate predictions in weeks 1-5.
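Sketching what I mean with xgboost on made-up features (the column names and synthetic data are mine, just to show the shape of the thing, not his model): give the trees last season's rating plus a games-played-so-far feature, and they'll naturally lean on the historical signal in week 1 and shift to current-season form later.

```python
import numpy as np
from xgboost import XGBRegressor

rng = np.random.default_rng(1)
n = 2000

# Made-up training rows, one per team-game.
games_played = rng.integers(0, 13, n)                  # 0 = season opener
last_yr_rating = rng.normal(0.0, 10.0, n)              # prior season's rating
cur_yr_margin = rng.normal(last_yr_rating, 8.0, n)     # current-season average margin so far
opp_rating = rng.normal(0.0, 10.0, n)

# Synthetic target: early in the year the "true" margin tracks last season,
# later it tracks current-season form -- exactly the interaction we want learned.
w = np.clip(games_played / 8.0, 0.0, 1.0)
y = (1 - w) * last_yr_rating + w * cur_yr_margin - opp_rating + rng.normal(0.0, 6.0, n)

X = np.column_stack([games_played, last_yr_rating, cur_yr_margin, opp_rating])
model = XGBRegressor(n_estimators=300, max_depth=4, learning_rate=0.05)
model.fit(X, y)

# A week-1 style prediction: no current-season signal yet, so the trees should
# fall back on last year's rating rather than the (meaningless) current margin.
print(model.predict(np.array([[0, 15.0, 0.0, 2.0]])))
```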
 
You know.... I used to work with a guy who I bet you'd really get along with well. (No one really liked him much either.) :shrug:
 
I bought a Kronecker product once, figured it was German and Germans always make good stuff.

It sucked. Sent it back.
 
da fuq are you babbling about? Lol
 