Batting Second: Tony Gwynn RF
How many times does the team that wins the first game of a series win the series?
Remember these questions? Let’s dive in.
Tony Gwynn - San Diego Padres
What is a series, Luis? I can hear you ask. Good question. Major League teams play 2 to 4 games in a row against the same team, a way to get a few games in while saving on transportation costs. A series, therefore, is those 2 to 4 games. Most of the time a series is 3 games long. What’s considered a series win? The only length that guarantees a winner is a series of 3 games, in which case the series winner is the team that wins at least 2 of the 3 games. Series of 2 and 4 games can end in a tie (1-1 or 2-2), so they don’t always produce a winner. So a corollary question may be: in a given season, say 2025, how many 2, 3, and 4 game series does a team play?
For the sake of this exercise, we are going to ignore series where one or more games were postponed and rescheduled to a future date, past the days the teams were originally scheduled to play each other. Make sense? If teams had a three game series scheduled, but the last game is postponed because of rain, and the following day one or both teams are scheduled to play some other team, then that game will be rescheduled for weeks or months later, depending on everyone’s schedule. For now, we will skip those and come back to them at a later date. That will be an interesting issue to handle.
Series information I think we need, and remember, more is better, is:
Date of game
Stadium
Teams
Score of game
Why the date? Because a series, sticking with 3 game series for now, is three games played in a row. So knowing the date, I can make sure I am looking at three games in order, with the same teams.
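Here is a rough sketch of how that grouping could work, just to make the idea concrete. Everything in it is my own assumption for illustration: the Game fields, the team and stadium names, and the scores are all made up, and I am treating a series as games on consecutive days with the same home and away teams (so doubleheaders and off days inside a series are not handled yet).

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Game:
    # Hypothetical record layout: one row per game.
    played_on: date
    stadium: str
    home: str
    away: str
    home_score: int
    away_score: int

    @property
    def winner(self) -> str:
        return self.home if self.home_score > self.away_score else self.away


def group_into_series(games):
    """Group games into series: same home/away teams on consecutive days."""
    series = []
    for game in sorted(games, key=lambda g: (g.home, g.away, g.played_on)):
        last = series[-1][-1] if series else None
        if (last is not None
                and (last.home, last.away) == (game.home, game.away)
                and game.played_on - last.played_on == timedelta(days=1)):
            series[-1].append(game)
        else:
            series.append([game])
    return series


def first_game_winner_rate(games):
    """Share of 3-game series won by the team that won game one."""
    three_game = [s for s in group_into_series(games) if len(s) == 3]
    if not three_game:
        return None
    won = 0
    for s in three_game:
        first = s[0].winner
        if sum(g.winner == first for g in s) >= 2:
            won += 1
    return won / len(three_game)


# Made-up sample data: two 3-game series between fictional teams AAA and BBB.
games = [
    Game(date(2025, 4, 1), "Stadium A", "AAA", "BBB", 3, 2),
    Game(date(2025, 4, 2), "Stadium A", "AAA", "BBB", 1, 4),
    Game(date(2025, 4, 3), "Stadium A", "AAA", "BBB", 5, 0),
    Game(date(2025, 4, 4), "Stadium B", "BBB", "AAA", 2, 6),
    Game(date(2025, 4, 5), "Stadium B", "BBB", "AAA", 3, 1),
    Game(date(2025, 4, 6), "Stadium B", "BBB", "AAA", 7, 2),
]
rate = first_game_winner_rate(games)
```

In this toy sample, AAA wins game one of both series but only takes one of them, so the rate comes out to 0.5. The real answer will come from real data.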
Why do we need to know the stadium? We don’t need it to answer the original question. But what if we would like to know whether a team wins series more often when they are at home than away? One would think so, but the data may tell us differently. Again, more is better than less.
Alrighty then…second question:
Hall of Famer Rod Carew was a lifetime .328 hitter and played from 1967 to 1985. My question is what was his batting average each decade he played?
This one is actually not that complicated, it just takes a bit more data. By definition, AVG = Hits / At Bats. Let’s break it down, but let’s go a little higher by starting with a plate appearance.
A plate appearance is defined as every time a batter steps up to the plate and faces the pitcher. However, a plate appearance does not equal an at bat. Let me clarify.
In order for a plate appearance to be considered an at bat, and therefore be used for the average calculation, only the following results of the plate appearance are to be included:
Player gets a hit (any kind)
Player is out, excluding sacrifice flies or bunts.
Player reaches base on an error
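To make the rule above concrete, here is a minimal sketch. The result labels ("hit", "out", "reached_on_error", and so on) are names I made up for illustration, not anything from a real data source.

```python
# Hypothetical labels for the outcome of a plate appearance.
AT_BAT_RESULTS = {"hit", "out", "reached_on_error"}


def is_at_bat(result: str) -> bool:
    """True if this plate appearance counts as an at bat."""
    return result in AT_BAT_RESULTS


def batting_average(results):
    """AVG = Hits / At Bats, or None when there are no at bats."""
    at_bats = [r for r in results if is_at_bat(r)]
    if not at_bats:
        return None
    hits = sum(r == "hit" for r in at_bats)
    return hits / len(at_bats)


# Six made-up plate appearances; the walk and the sacrifice fly
# are plate appearances but not at bats.
results = ["hit", "walk", "out", "sacrifice_fly", "reached_on_error", "out"]
avg = batting_average(results)
```

Six plate appearances, but only four at bats and one hit, so the average here is 1 / 4 = .250.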
So, based on the above we need the following data, per game:
Date of game
Plate appearances
Result of the plate appearance:
Hits
Walks (remember, more is better)
Reach on error
Out, with sacrifice flies and bunts marked so they can be left out of at bats
Hit by pitch
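Once we have hits and at bats per game, the by-decade average could be computed along these lines. This is only a sketch: the (date, hits, at bats) layout is my assumption, and the numbers are invented, not Rod Carew's actual stats. Note that we add up hits and at bats first and divide once per decade, rather than averaging the per-game averages.

```python
from collections import defaultdict
from datetime import date


def average_by_decade(game_lines):
    """game_lines: iterable of (played_on, hits, at_bats), one per game."""
    totals = defaultdict(lambda: [0, 0])  # decade -> [hits, at_bats]
    for played_on, hits, at_bats in game_lines:
        decade = played_on.year // 10 * 10
        totals[decade][0] += hits
        totals[decade][1] += at_bats
    # Skip decades with no at bats to avoid dividing by zero.
    return {d: h / ab for d, (h, ab) in sorted(totals.items()) if ab > 0}


# Invented sample lines, for illustration only.
game_lines = [
    (date(1969, 5, 1), 2, 4),
    (date(1969, 5, 2), 1, 4),
    (date(1975, 6, 1), 3, 4),
    (date(1981, 7, 1), 1, 5),
]
by_decade = average_by_decade(game_lines)
```

With these made-up lines, the 1960s come out to 3 / 8 = .375, the 1970s to 3 / 4 = .750, and the 1980s to 1 / 5 = .200.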
Well, that’s it. This is all the data we are going to need to answer both of these questions. Next time, I’ll begin to show how to get the data added to Postgres. That’s the simple part. After that, we’ll start looking at the data itself.
Until then, please be kind to one another.
Luis