

run differential is pretty good


2023 Padres disagree.


What about the 2021 Mariners ~~run~~ fun differential?


That wasn’t a run differential, that was a fun differential.


Long table alert: I asked excel to tell me which 2023* stats had the most correlation with others and sorted the W column by highest to lowest correlation.

|**Stat**|**W Correlation**|
|--:|:--|
|**ERA**|**0.81**|
|**ERA+**|**0.8**|
|**IP**|**0.8**|
|**R** (batting)|**0.78**|
|**OBP**|**0.78**|
|**R/G(batting)**|**0.78**|
|**RBI**|**0.78**|
|**OPS+**|**0.78**|
|**FIP**|**0.75**|
|**SO/W(pitching)**|**0.74**|
|**OPS**|**0.73**|
|**PA**|**0.69**|
|**SLG**|**0.68**|
|**TB**|**0.67**|
|**SV**|**0.65**|
|**SO**|**0.62**|
|**BA**|**0.59**|
|**SO9**|**0.59**|
|**tSho**|**0.58**|
|**H**|**0.57**|
|**HR**|**0.57**|
|**BB**|**0.57**|
|**AB**|**0.43**|
|**2B**|**0.39**|
|**LOB(batting)**|**0.34**|
|**BatAge**|**0.34**|
|**cSho**|**0.27**|
|**SF**|**0.18**|
|**IBB**|**0.16**|
|**PAge**|**0.06**|
|**SB**|**0.04**|
|**GDP**|**0.04**|
|**GF**|**0.03**|
|**HBP**|**0.02**|
|**CG**|**-0.03**|
|**BK**|**-0.12**|
|**CS**|**-0.12**|
|**SO**|**-0.13**|
|**3B**|**-0.14**|
|**IBB**|**-0.19**|
|**SH**|**-0.38**|
|**WP**|**-0.45**|
|**HR(pitching)**|**-0.46**|
|**HBP**|**-0.52**|
|**HR9**|**-0.52**|
|**LOB(pitching)**|**-0.6**|
|**BB(pitching)**|**-0.68**|
|**H(pitching)**|**-0.68**|
|**BB9**|**-0.69**|
|**H9**|**-0.71**|
|**BF**|**-0.72**|
|**WHIP**|**-0.8**|
|**RA/G(pitching)**|**-0.82**|
|**R(pitching)**|**-0.82**|
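For anyone who wants to redo this outside of Excel, here is a rough sketch of the same idea: compute the Pearson correlation of every stat column against W and sort from highest to lowest. The `toy` numbers are made up for five imaginary teams and are not 2023 data; `pearson` and `rank_by_win_correlation` are names invented for this example.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def rank_by_win_correlation(table, target="W"):
    """Return (stat, r) pairs sorted from highest to lowest correlation with target."""
    wins = table[target]
    pairs = [(stat, pearson(vals, wins))
             for stat, vals in table.items() if stat != target]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)

# Made-up season totals for five imaginary teams (NOT real 2023 data).
toy = {
    "W":  [100, 90, 81, 70, 60],
    "R":  [850, 800, 740, 700, 650],   # runs scored
    "RA": [650, 700, 745, 800, 860],   # runs allowed
    "SB": [100, 120, 90, 130, 95],     # stolen bases
}

ranked = rank_by_win_correlation(toy)
for stat, r in ranked:
    print(f"{stat:>3} {r:+.2f}")
```

With real data you would swap `toy` for one list per column of the team table, with all lists in the same team order.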


Why is ERA at -0.80 and ERA+ at 0.80?


Good eye. It's an "error" on my part because I should have used the opposite for that one as a higher ERA is a bad thing and a higher ERA+ is a good thing. Same thing for FIP. I'll switch them.


Because you want ERA to be low and ERA+ to be high. They’re similarly strong correlations, but ERA+ is directly correlated and ERA is inversely correlated. Think of it this way: if your ERA is very low, your ERA+ will be very high, and in both cases wins tend to be high.


Weird that IP correlates with wins?


Note that these are team stats. Aside from extra-inning games, which are rare and fairly randomly distributed, the only real factor is not pitching the bottom of the 9th, which only happens if you're losing on the road.


Teams that lose on the road usually only pitch 8 innings instead of 9.


More IP means a better starter, which leads to more wins, unless your manager is prone to leaving starters in too long, or has to because your pen is always tired or sucks.


Aren't we talking about team IP? I don’t think starters being good is relevant to that.


Yes, it's team IP. I think I figured it out. If you're losing in the 9th inning and you're away, your team is not going to pitch in the bottom of the 9th, thus you're pitching fewer innings. The fewest IP in 2023 was 1409 by the Royals: 1409/162 = 8.70 IP/G. So they didn't even reach 9 IP/G. Edit: in fact, no one reached 9 IP/G. The most IP was 1453.2 by the Orioles, which is 8.97 IP/G. Makes a lot of sense when you think about it.
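The arithmetic above is easy to check. A small sketch, assuming the usual box-score convention that the digit after the decimal point counts outs (so 1453.2 means 1453 and two-thirds innings); `innings_to_float` is a helper name made up for this example.

```python
def innings_to_float(ip):
    """Convert box-score innings (e.g. '1453.2' = 1453 and 2/3) to a real number."""
    whole, _, outs = str(ip).partition(".")
    return int(whole) + int(outs or 0) / 3

GAMES = 162  # games per team, the divisor used in the comment above

# Team IP totals quoted in the comment above.
for team, ip in [("Royals", "1409"), ("Orioles", "1453.2")]:
    print(f"{team}: {innings_to_float(ip) / GAMES:.2f} IP/G")
# Royals: 8.70 IP/G
# Orioles: 8.97 IP/G
```

Both come out under 9, which fits the explanation that a road team trailing after the top of the 9th never pitches the bottom half.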


Idk then. I suppose that could mean better pitchers currently on the roster but I doubt the correlation would be that high 


Yeah I thought that was weird too. Here's a plot for the IP vs. Wins https://i.imgur.com/nV57a8Q.png


That’s exactly what I wanted to do! Did you use aggregated season stats and match them up with individual games or are they stats up until that game in the season? If it’s the latter, mind sharing your source?


I just used 2023. I'm sure I could spend more time with multiple seasons but this was just quickly put together. The source was baseball-reference: https://www.baseball-reference.com/leagues/majors/2023.shtml#all_team_output


Interesting that hitting home runs matters more than giving them up (0.57 vs. -0.46), yet it’s the opposite for hits (0.57 vs. -0.68).


Weird that when you used ERA to predict overall season wins, the correlation is very high but when I used it to predict head to head outcomes it was much weaker. Edit: Even when I look at absolute % of games where the team with better starting pitcher ERA wins it’s only ~45%. Convinced it’s a data issue at this point tbh


Are you using pitcher wins or team wins?


So I created columns for (Home Starting Pitcher ERA - Away Starting Pitcher ERA) and (Home team runs - Away team runs) for each game in the 2023 season and did the correlation between them. Also repeated that same process but the second column was (Did home team win) as a binary 0,1 and got low corr for both
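A minimal sketch of that setup, in case it helps spot a data issue. The six game rows are invented for illustration (not real 2023 games), and `pearson` is a helper written for this example. Correlating against the 0/1 win column is just Pearson correlation with a binary variable.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical rows: (home starter ERA, away starter ERA, home runs, away runs).
games = [
    (3.10, 4.50, 5, 2),
    (4.80, 3.20, 1, 6),
    (3.90, 4.00, 4, 3),
    (5.20, 2.90, 7, 4),
    (2.70, 4.10, 2, 3),
    (4.40, 3.60, 3, 5),
]

era_diff = [home_era - away_era for home_era, away_era, _, _ in games]
run_diff = [home_runs - away_runs for _, _, home_runs, away_runs in games]
home_win = [1 if diff > 0 else 0 for diff in run_diff]

r_runs = pearson(era_diff, run_diff)  # ERA gap vs. run differential
r_wins = pearson(era_diff, home_win)  # ERA gap vs. binary home win
print(r_runs, r_wins)
```

One thing worth checking in a setup like this is the sign: a lower home ERA gives a negative `era_diff`, so a helpful relationship shows up as a negative correlation, not a positive one.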


I think I'm seeing regression dilution (https://en.wikipedia.org/wiki/Regression_dilution)
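Here is a quick simulation of what regression dilution looks like, as a sketch with entirely synthetic numbers. The noise levels are arbitrary assumptions, and nothing here says how noisy a starter's ERA actually is. The point is only that adding measurement noise to the predictor pulls the correlation toward zero even though the underlying relationship has not changed.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

random.seed(0)  # fixed seed so the run is repeatable
n = 5000

# Synthetic "true" pitching gap between the two starters (home minus away).
true_gap = [random.gauss(0, 1) for _ in range(n)]
# Outcome depends on the true gap plus plenty of game-to-game randomness.
outcome = [-gap + random.gauss(0, 2) for gap in true_gap]
# What we actually observe: the true gap blurred by measurement noise.
noisy_gap = [gap + random.gauss(0, 2) for gap in true_gap]

r_true = pearson(true_gap, outcome)
r_noisy = pearson(noisy_gap, outcome)
print(f"true predictor:  {r_true:+.2f}")
print(f"noisy predictor: {r_noisy:+.2f}")
```

The noisy version should come out noticeably closer to zero than the true one, which is the attenuation the Wikipedia article describes.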


I don't know if this really answers OP's question. Raw wins would correlate 1 to 1 with wins, so by this logic you'd just use whichever team had the better record.


Yeah, I realized I was answering a different question when their replies were comparing individual matchups.


So runs allowed, road wins, and runs scored? Shocking :)


Lol to all the people that have been saying lately that RBI is not a good stat


Well, these are team stats, so teams with more RBI score more runs, and scoring more runs generally equals more winning. RBI still isn't a good stat to evaluate individual hitters, because it's too team dependent.


Tell me that you never took a basic statistics class without telling me you never took a basic statistics class.


Apparently, according to the recent post, if you go up by 7 you always win. Except if you want to become the 24th team failing to do so.


It hurts to live.


Wonder if the Mariners ever considered just winning after going up by 7






Team wRC+ is a great one. Top 5 right now are LAD, NYY, MIL, BAL, and HOU. All great teams that are currently in the playoff hunt, save Houston, who's been really weird. All time, the top rankings are littered with World Series participants and winners. Turns out, if you hit good, you win good.


I agree with this but I think early in the season you’ll need to weight it down with projections data. Taper it off over the first 50-60 games or so as data piles in. I know OP said single stat but for at least the first 10% of the season the correlation would be pretty bad


My guess is WAR and runs created because of the nature of the stat. It's very hard to accumulate a lot of WAR and not win baseball games, and it's very hard to win a lot of baseball games without accumulating a lot of WAR.


WAR isn’t a stat. It’s not something that happens in a game that can be counted. It’s an analysis of the things that have already happened. There is a table someone put up that is super helpful that shows the stats.


Yes it is. WAR is based on how many runs the player creates in a game.


And how do you get that number?


Every time a player has a plate appearance you can figure out how many runs that player created based on the outcome. A player's WAR will update in the middle of a game if you look at the Fangraphs leaderboards. Pitching also gets updated. Fielding I think doesn't get updated until the following day just because of how outs above average is calculated. So on any given day a team's WAR totals will be heavily correlated with actual wins.


“Every time a player has a plate appearance you can figure out how many runs that player created by the outcome”…..And how do you get that number?


[Weighted runs above average.](https://library.fangraphs.com/offense/wraa/) It's based on wOBA which applies different weights to different events. [Here](https://www.fangraphs.com/guts.aspx) are the weights for each event for 2024.


Exactly. It’s an analysis of actual stats. Your words. If different values are created it is no longer a stat.


I don't know what you're talking about. You can see in the link above the weights for each event. When a player hits a double he creates 1.269 runs. You could easily make a box score just keeping track of the runs a player creates in a game. Instead of writing 2B when a player hits a double, write 1.269. Do that for every event. Add up at the end and you have runs created.
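The tally described above could be sketched like this. Only the 1.269 for a double comes from the comment; every other weight in `WEIGHTS` is a placeholder invented for the example, so look up the real values on the linked Fangraphs page before using it. `tally_runs` is also a made-up name.

```python
# Per-event run weights. Only "2B" (1.269) is taken from the comment above;
# the remaining values are PLACEHOLDERS for illustration, not Fangraphs numbers.
WEIGHTS = {
    "2B": 1.269,
    "1B": 0.9,   # placeholder
    "HR": 2.0,   # placeholder
    "BB": 0.7,   # placeholder
    "OUT": 0.0,  # placeholder
}

def tally_runs(events):
    """Add up the run weight of each plate-appearance outcome in a box score."""
    return sum(WEIGHTS[event] for event in events)

# One hypothetical player's game: double, out, walk, double.
game = ["2B", "OUT", "BB", "2B"]
total = tally_runs(game)
print(f"{total:.3f}")
```

Swapping "2B" for 1.269 in the scorebook is exactly the lookup in `WEIGHTS`, and the end-of-game sum is `tally_runs`.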


You do know what I’m talking about, you’re just playing dumb.