PICture 'RPI' Engine - Tutorial

-teams.txt

INPUT - The name 'teams.txt' is the default name for the 'key' (operative word here) list of the teams for which you want ratings. We will call it the teams file. The program can prompt you for any filename you want. For example, you could have a file named "teams.txt" containing all the teams' names in your division and another file named "south.txt" containing just the teams' names in the 'south' division, etc. As you will read later, this system allows you to maintain one 'master' scores file but still control the program output with this simple text file. The program will only consider games in which both teams are named in your teams file.

Another 'side effect' of this system is that it will catch spelling differences in teams' names. Since the teams' names are used to instantiate the containers, each of these names must be unique. The program will display an error if a team is named more than once in the teams file. You may indeed mean two different teams, but each must have a unique name. Similarly, it will catch spelling differences in the scores file. The maintainer (you!) of the scores may have made a 'typo' in a team name field on the spreadsheet. Any name mentioned in the scores file but not in the teams file will be logged, alphabetically, to the "checkme.txt" file. You can check that file to see that it contains only the names of teams that you did not want included in the output.

The format of this teams file is simple text, one team name to a line. You may, for readability, have blank lines in the file, and any line that starts with a semi-colon <;> is skipped. You can use such a line for comments.

Consider the old MacDonald's farm teams file 'farmtms.txt'. Yes, the pun was intended.

Cows
Pigs

Roosters
Hens
Chicks
Eggs

Maggots

Cats
Dogs

We are interested in the birds division so we 'Cut 'n' Paste' the names...

Roosters
Hens
Chicks
Eggs

;Maggots

;Cats

...to a file we will call 'birdtms.txt'. Note that the names Maggots and Cats are commented out. We will use them later. If we maintain the master file 'farmtms.txt' with a spreadsheet/database, then we can use a lookup function to put valid names into the master scores file. I'll leave the operation of the spreadsheet/database to the user.
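The teams file rules above (one name per line, blank lines and <;> lines skipped, duplicates rejected) can be sketched in a few lines of Python. This is only an illustration, not the program's own code. The function name read_teams is made up for the example, and I am assuming that names are compared exactly as typed and that leading blanks before a <;> do not matter.

```python
def read_teams(lines):
    """Return the team names in file order; raise ValueError on a duplicate."""
    teams = []
    seen = set()
    for raw in lines:
        name = raw.strip()
        # Blank lines and comment lines (starting with ';') are skipped.
        # Assumption: blanks before the ';' are ignored.
        if not name or name.startswith(";"):
            continue
        # Assumption: names are compared exactly, including letter case.
        if name in seen:
            raise ValueError(f'Duplicate team. "{name}" in \'teams\' file!')
        seen.add(name)
        teams.append(name)
    return teams


# The contents of 'birdtms.txt' from the tutorial.
birdtms = """Roosters
Hens
Chicks
Eggs

;Maggots

;Cats
""".splitlines()

print(read_teams(birdtms))
```

Only the four uncommented bird names come back. Feeding it a list with the same name twice raises the error instead.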

-scores.txt

INPUT - The name "scores.txt" is the default name for the comma <,> delimited file containing your for and against information. Basically, it is who played whom and who won.

At MacDonald's farm, in the birds division, we already know the pecking order. The Roosters can crow about besting every other team in their division. The Hens are laying in second place, winning over all other teams in their division save for the Roosters. The Chicks can't beat any team but the Eggs. And the Eggs, well, they have yet to come out of their shell and scramble for a victory. Get the yolks?? Heh... it is late.

Our master scores file 'farmscr.txt' contains data that looks something like this...

, Dogs, 10, Cats, 4, BM

; These next lines are pre-season scores.
; We don't want to count them in the calculations.

;week 0, Horses, 14 ,Bulls, 12, BM
;week 0, Bears, 12, Dogs, 10, BM

;week 0, Roosters, 1, Eggs, 0, BM
;week 0, Chicks, 1, Hens, 0, BM

; Start of regular season...

week 1, Roosters, 1 ,Hens, 0, BM
week 1, Chicks, 1, Eggs, 0, BM

week 2, Roosters, 1, Chicks, 0, BM
week 2, Hens, 1, Eggs, 0, BM

week 3, Roosters, 11, Cats, 6, BM
week 3, Maggots, 4, Eggs, 1, BM

week 4, Roosters, 1, Eggs, 0, BM
week 4, Hens, 1, Chicks, 0, BM

, Bears, 6, Bulls, 3, BM

...and so on. Note the use of the semi-colon <;> to comment out lines that we don't want to process. Note that you can use the actual scores or the binary 1, 0 system to indicate winners and losers. Blank lines are fine, so long as they are truly blank. No hidden tabs or anything. Apart from these semi-colon lines there is no other provision for comments in the scores file, but comments can be inserted after the second team's score and its trailing comma.

Blanks before or after names are 'eaten' by the system. Blanks embedded in a name are preserved. The first field, in this case 'week ?', is purely for your use in keeping track of your data. It is not used by the program. If you decide not to use this management aid, then be sure to start each line with the comma <,> delimiter. Only the next four fields, a pair of name, score fields, are significant. You may follow the second score with a comma <,> and any other field(s) you want to include. Here in our example I have used an extra field, the scorekeeper's initials.

Note that the order of winners and losers is irrelevant. There is no significance to the order of these pairs. You may decide to list the 'home' team name<,> and their score<,> followed by the 'visitors' team name<,> and their score. Here in our example I have listed the pairs by 'winners' first and 'losers' second. That way I might see any errors at a glance. For example, if I see 'Eggs' in the first significant field a flag goes up because I know the Eggs never won a game. Your decision though.
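To make the field layout concrete, here is a rough Python sketch of how one line of the scores file could be split up. It is not the program's actual code. The name parse_score_line is hypothetical, and I am assuming whole-number scores and that a line beginning with <;> is skipped, as in the sample file above.

```python
def parse_score_line(line):
    """Return (team_a, score_a, team_b, score_b), or None for a skipped line."""
    # Blank lines and ';' lines carry no game (assumption based on the sample).
    if not line.strip() or line.lstrip().startswith(";"):
        return None
    # Blanks around each field are 'eaten'; blanks inside a name are kept.
    fields = [field.strip() for field in line.split(",")]
    if len(fields) < 5:
        raise ValueError(f"expected at least 5 comma-separated fields: {line!r}")
    # fields[0] is the free-form tracking field ('week 1' or empty): ignored.
    # Anything after the second score (initials, notes) is ignored too.
    team_a, score_a, team_b, score_b = fields[1:5]
    # Assumption: scores are whole numbers.
    return team_a, int(score_a), team_b, int(score_b)


print(parse_score_line("week 1, Roosters, 1 ,Hens, 0, BM"))
print(parse_score_line(", Dogs, 10, Cats, 4, BM"))
print(parse_score_line(";week 0, Bears, 12, Dogs, 10, BM"))
```

A line with too few commas raises an error here, which is one way to surface the 'missing comma' problem early.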

-picture.exe

EXECUTABLE - We will execute the program, for this tutorial, with the default arguments.

Usage : PICture [([-/][gG]) ([-/][iI]) ([-/][rR]#) ([-/][sS]#)]
No argument default.
Calculate 'my' opponents' average WP by "Teams".
Exclude 'me' from 'my' opponents' average WP.
Calculate 'my' RPI with 25% of 'my' WP,
and (100 - 25) 75% of 'my' Strength of Schedule.
Calculate 'my' SOS with 67% of 'my' opponents' average WP,
and (100 - 67) 33% of 'my' opponents' opponents' average WP.
The default is the (25% WP + 50% OWP + 25% OOWP) RPI model.
or argument ([-/][[gG]), PICture -g
Calculate 'my' opponents' average WP by "Games".
and/or argument ([-/][[iI]), PICture -i
Include 'me' in 'my' opponents' average WP.
and/or argument ([-/][rR]#), PICture -r20
Calculate 'my' RPI with 20% of 'my' WP,
and (100 - 20) 80% of 'my' Strength of Schedule.
and/or argument ([-/][sS]#), PICture -s50
Calculate 'my' SOS with 50% of 'my' opponents' average WP,
and (100 - 50) 50% of 'my' opponents' opponents' average WP.
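As a rough illustration of the usage text above, the following Python sketch reads the four optional arguments and works out the overall weights they imply. The function names parse_options and effective_weights are invented for this example, and how the real program validates the numbers is not known to me. It does show why the default 25/67 settings come out as roughly 25% WP + 50% OWP + 25% OOWP.

```python
import re


def parse_options(args):
    """Parse -g, -i, -r# and -s# (prefix '-' or '/', either letter case)."""
    opts = {"by_games": False, "include_me": False, "r": 25, "s": 67}
    for arg in args:
        match = re.fullmatch(r"[-/]([gGiI]|[rRsS]\d+)", arg)
        if match is None:
            raise ValueError(f"unrecognized argument: {arg}")
        key, value = match.group(1)[0].lower(), match.group(1)[1:]
        if key == "g":
            opts["by_games"] = True
        elif key == "i":
            opts["include_me"] = True
        else:
            opts[key] = int(value)  # percentage for 'r' or 's'
    return opts


def effective_weights(r, s):
    """Return the (WP, OWP, OOWP) weights implied by r and s, as fractions."""
    sos_share = (100 - r) / 100  # the part of the RPI that comes from SOS
    return r / 100, sos_share * s / 100, sos_share * (100 - s) / 100


print(parse_options(["-r20", "/S50"]))
print(effective_weights(25, 67))
```

With the defaults the three weights are 0.25, about 0.5025 and about 0.2475, which is the 25/50/25 model to the nearest percent.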

Simply type the program name, PICture, hit 'Enter', and the "splash screen" and a menu will be presented...

Copyright: Signal Computing Service Ltd.
All Rights Reserved.
Programmer: Brian MacBride
E-mail: macbride@shaw.ca
Web page: http://members.shaw.ca/macbride/
Compiled: Fri Feb 25 10:11:06 2000.

--> PICture <--

1 - Teams input file -> teams.txt
2 - Games input file -> scores.txt
3 - RPI output file -> rankRPI.txt
4 - SOS output file -> rankSOS.txt
5 - Process rankings
6 - Exit program

Please enter your selection (1 - 6)?

Note the string between the --> and <--. This will be the name of the program and any optional arguments you input. If the default input and output filenames are not the files you want, then input the option number and 'Enter' and you will be prompted for a new filename. If the filename is that of an input file and that file doesn't exist, you will be warned.
...
5 - Process rankings
6 - Exit program

Please enter your selection (1 - 6)? 1

Enter "teams" filename? anewfile.txt

Can't open input file "anewfile.txt"!

1 - Teams input file -> anewfile.txt
2 - Games input file -> scores.txt
...
When all the filenames are to your satisfaction then input '5' and 'Enter'. The program will first read the teams file. If a duplicate is found the program will stop...
...
5 - Process rankings
6 - Exit program

Please enter your selection (1 - 6)? 5

Reading "birdtms.txt" data...

Duplicate team. "Horses" in 'teams' file!

1 - Teams input file -> birdtms.txt
2 - Games input file -> farmscr.txt
...
...and you will have to resolve the duplication.

If the teams file is read OK, then the scores file is processed...
...
5 - Process rankings
6 - Exit program

Please enter your selection (1 - 6)? 5

Reading "birdtms.txt" data...
Creating the "teams" database...
Reading "farmscr.txt" data...

Missing teams are logged in "CheckMe.txt"!

Calculating...
...teams' "Winning percentages"...
...opponents' "Winning percentages"...
...opponents' opponents' "Winning percentages"...
...indexes for "RPI" and "SOS" rankings...
Sorting "RPI" rankings...
Sorting "SOS" rankings...
Writing "rankRPI.txt" data...
Writing "rankSOS.txt" data...

1 - Teams input file -> birdtms.txt
2 - Games input file -> farmscr.txt
...
... while displaying the progress of the calculations. If there are teams in the scores file that are not in the teams file, you will hear a 'beep' and see the message, Missing teams are logged in "CheckMe.txt"!

-checkme.txt

OUTPUT - This file contains the names of teams mentioned in your scores file that were not in your teams file. Usually this is by intent, as you only wanted ratings for a subset of the full score set. It could indicate an error though. The 'missing teams' are logged alphabetically but only once, even though there may be many scores involving them. With our farm analogy, you scan this file and see 'Huns'?? You know that there is no team named the 'Huns' but you do have a team named the 'Hens'. You have probably just trapped a keypunch error in your scores file. Someone (you??) has entered 'Huns' when they meant 'Hens'. Of course, using your spreadsheet's lookup feature will catch these things before they happen and is recommended.
Note that this file also contains an alphabetical listing of each team, its record, and each of its opponents with that opponent's record against it.
Using the teams file and scores file from above should leave something like this...

These teams are NOT in the 'teams' file.
Any game they may have played in are NOT used
in the RPI or SOS ratings calculations.

Bears
Bulls
Cats
Dogs
Maggots

Alphabetical list of Teams, Games, Wins
  And their opponents... Teams, Games, Wins

Chicks,3,1
  Eggs,1,0
  Hens,1,1
  Roosters,1,1
Eggs,3,0
  Chicks,1,1
  Hens,1,1
  Roosters,1,1
Hens,3,2
  Chicks,1,0
  Eggs,1,0
  Roosters,1,1
Roosters,3,3
  Chicks,1,0
  Eggs,1,0
  Hens,1,0

...in the 'checkme.txt' file.
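The 'missing teams' part of this file boils down to collecting every name that appears in a game but not in the teams file, once each, in alphabetical order. Here is a minimal Python sketch of that idea, using the uncommented games from 'farmscr.txt' and the four bird teams. The function name missing_teams is hypothetical and the real program may well do this differently.

```python
def missing_teams(teams, games):
    """Return the names found in games but not in teams, sorted, once each."""
    known = set(teams)
    missing = set()  # a set keeps each name only once
    for team_a, _score_a, team_b, _score_b in games:
        for name in (team_a, team_b):
            if name not in known:
                missing.add(name)
    return sorted(missing)


birds = ["Roosters", "Hens", "Chicks", "Eggs"]

# The uncommented lines of 'farmscr.txt' as (name, score, name, score).
farm_games = [
    ("Dogs", 10, "Cats", 4),
    ("Roosters", 1, "Hens", 0), ("Chicks", 1, "Eggs", 0),
    ("Roosters", 1, "Chicks", 0), ("Hens", 1, "Eggs", 0),
    ("Roosters", 11, "Cats", 6), ("Maggots", 4, "Eggs", 1),
    ("Roosters", 1, "Eggs", 0), ("Hens", 1, "Chicks", 0),
    ("Bears", 6, "Bulls", 3),
]

for name in missing_teams(birds, farm_games):
    print(name)
```

This lists Bears, Bulls, Cats, Dogs and Maggots, the same five names shown in the sample file above.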
It is also possible to have team names in the teams file that have no corresponding games in the scores file. In that case no warnings are given but they will show up at the bottom of the ranking files, tied for last.
Bear in mind that if you try to marry up last year's tax file and your wife's address list, then the output files will contain meaningless junk. If the output files look unrecognizable, then one or more of the above syntax rules has been broken, more than likely a missing or extra comma <,> somewhere. Be aware that because the scores file is comma <,> delimited, it should not contain any user-inserted commas in the teams' names or comment fields.

-rankRPI.txt

OUTPUT - The comma <,> delimited results of the above data, sorted by RPI ratings for the birds division at old MacDonald's farm. In raw form it looks like...

1,Roosters,0.625,3,3,1.000,0.500,0.500,0.500,1
2,Hens,0.542,3,2,0.667,0.500,0.500,0.500,2
3,Chicks,0.458,3,1,0.333,0.500,0.500,0.500,3
4,Eggs,0.375,3,0,0.000,0.500,0.500,0.500,4

...the above. Import this file into your spreadsheet for easy reading and we'll discuss the RPI results. The SOS results will be discussed in the next section.

#  Team      RPI    G  W  WP     OWP    OOWP   SOS    #
1  Roosters  0.625  3  3  1.000  0.500  0.500  0.500  1
2  Hens      0.542  3  2  0.667  0.500  0.500  0.500  2
3  Chicks    0.458  3  1  0.333  0.500  0.500  0.500  3
4  Eggs      0.375  3  0  0.000  0.500  0.500  0.500  4

No real surprises here. The layout is pretty self-explanatory. Note that I am displaying Games and Wins rather than wins and losses. This makes the program more compatible with sports that allow ties. Note that any ties in the RPI will be resolved by using the SOS as the secondary key. In the rare instance where the tie is still not resolved, the results are sorted alphabetically but are given the same ranking. #1, #2, #3, #3, #5, #6,...
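If you want to check the RPI and SOS columns by hand, the default weights from the usage text (25% WP for the RPI, 67% OWP for the SOS) can be applied as in this small Python sketch. The function names sos and rpi are mine, not the program's, and the OWP and OOWP values of 0.500 are simply taken from the table above.

```python
def sos(owp, oowp, s=67):
    """Strength of Schedule: s% of OWP plus (100 - s)% of OOWP."""
    return (s * owp + (100 - s) * oowp) / 100


def rpi(wp, owp, oowp, r=25, s=67):
    """RPI: r% of WP plus (100 - r)% of the Strength of Schedule."""
    return (r * wp + (100 - r) * sos(owp, oowp, s)) / 100


# (team, games, wins) for the birds division; OWP and OOWP are 0.500 for all.
birds = [("Roosters", 3, 3), ("Hens", 3, 2), ("Chicks", 3, 1), ("Eggs", 3, 0)]

for team, games, wins in birds:
    wp = wins / games
    print(f"{team:9s} RPI={rpi(wp, 0.5, 0.5):.3f} SOS={sos(0.5, 0.5):.3f}")
```

For the Hens, for example, that is 0.25 x 0.667 + 0.75 x 0.500, or about 0.542.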

Insert the names 'Cats' and 'Maggots' into your birdtms.txt file (or remove the <;> comments indicator from above) to include the games played in 'week 3' by these new extra-division opponents against the birds division.

Roosters
Hens
Chicks
Eggs

Maggots

Cats

Run the program and now our data shows...

#  Team      RPI    G  W  WP     OWP    OOWP   SOS    #
1  Cats      0.595  1  0  0.000  1.000  0.375  0.794  1
2  Roosters  0.585  4  4  1.000  0.375  0.594  0.447  4
3  Hens      0.521  3  2  0.667  0.500  0.417  0.472  2
4  Chicks    0.438  3  1  0.333  0.500  0.417  0.472  3
5  Maggots   0.343  1  1  1.000  0.000  0.375  0.124  6
6  Eggs      0.274  4  0  0.000  0.375  0.344  0.365  5

...what we would expect?? Note that the 'Cats', who never won a game, are rated #1 on strength of schedule alone, because they played and lost to a team with a perfect winning percentage. Similarly, the 'Maggots', who have a perfect winning percentage, are rated #5 because their only opponent had the worst record of all the combatants.
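For the curious, the whole calculation can be sketched in Python as below. This is my reading of the default options, not the program's source. I am assuming that 'by Teams' means each distinct opponent counts once, that 'my' games are left out of each opponent's WP, that a team with no games left over gets a WP of 0, and that OOWP is the plain average of 'my' opponents' OWP. The function name rate is made up.

```python
from collections import defaultdict


def rate(teams, games, r=25, s=67):
    """Return {team: (G, W, WP, OWP, OOWP, SOS, RPI)} for the listed teams."""
    known = set(teams)
    # results[a][b] = [games a played against b, games a won against b]
    results = {t: defaultdict(lambda: [0, 0]) for t in teams}
    for a, score_a, b, score_b in games:
        if a not in known or b not in known:
            continue  # both sides must be named in the teams file
        results[a][b][0] += 1
        results[b][a][0] += 1
        if score_a > score_b:
            results[a][b][1] += 1
        elif score_b > score_a:
            results[b][a][1] += 1

    def wp(team, exclude=None):
        recs = [rec for opp, rec in results[team].items() if opp != exclude]
        played = sum(rec[0] for rec in recs)
        won = sum(rec[1] for rec in recs)
        return won / played if played else 0.0  # assumption: no games -> 0

    def mean(values):
        values = list(values)
        return sum(values) / len(values) if values else 0.0

    # 'By Teams': each distinct opponent counts once, with 'me' excluded.
    owp = {t: mean(wp(o, exclude=t) for o in results[t]) for t in teams}
    oowp = {t: mean(owp[o] for o in results[t]) for t in teams}
    table = {}
    for t in teams:
        played = sum(rec[0] for rec in results[t].values())
        won = sum(rec[1] for rec in results[t].values())
        sos = (s * owp[t] + (100 - s) * oowp[t]) / 100
        rpi = (r * wp(t) + (100 - r) * sos) / 100
        table[t] = (played, won, wp(t), owp[t], oowp[t], sos, rpi)
    return table


teams = ["Roosters", "Hens", "Chicks", "Eggs", "Maggots", "Cats"]
games = [
    ("Dogs", 10, "Cats", 4),
    ("Roosters", 1, "Hens", 0), ("Chicks", 1, "Eggs", 0),
    ("Roosters", 1, "Chicks", 0), ("Hens", 1, "Eggs", 0),
    ("Roosters", 11, "Cats", 6), ("Maggots", 4, "Eggs", 1),
    ("Roosters", 1, "Eggs", 0), ("Hens", 1, "Chicks", 0),
    ("Bears", 6, "Bulls", 3),
]

for team, row in sorted(rate(teams, games).items(), key=lambda kv: -kv[1][6]):
    print(team, row[0], row[1], " ".join(f"{x:.3f}" for x in row[2:]))
```

Under those assumptions the figures agree with the table above to within rounding in the last digit. The Cats' OWP of 1.000 is the Roosters' record with the Cats game left out, which is what lifts them to the top.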

-rankSOS.txt

OUTPUT - The comma <,> delimited results of the above data, sorted by SOS ratings for the birds division at old MacDonald's farm. In raw form it looks like...

1,Roosters,0.500,3,3,1.000,0.500,0.500,0.625,1
2,Hens,0.500,3,2,0.667,0.500,0.500,0.542,2
3,Chicks,0.500,3,1,0.333,0.500,0.500,0.458,3
4,Eggs,0.500,3,0,0.000,0.500,0.500,0.375,4

...the above. Import this file into your spreadsheet for easy reading and we'll discuss the SOS results. The RPI results were discussed in the previous section.

#  Team      SOS    G  W  WP     OWP    OOWP   RPI    #
1  Roosters  0.500  3  3  1.000  0.500  0.500  0.625  1
2  Hens      0.500  3  2  0.667  0.500  0.500  0.542  2
3  Chicks    0.500  3  1  0.333  0.500  0.500  0.458  3
4  Eggs      0.500  3  0  0.000  0.500  0.500  0.375  4

Note that any ties in the SOS will be resolved by using the RPI as the secondary key. In the rare instance where the tie is still not resolved, the results are sorted alphabetically but are given the same ranking. #1, #2, #3, #3, #5, #6,... If you are not familiar with the concept of Strength of Schedule, you may wonder about all the teams having an identical 0.500 strength of schedule. This is because each of the four teams has played every other team exactly the same number of times. Kind of a 'closed loop' thing. Also, because they all have the same SOS value, their ties are resolved by their RPI. Note the coincidence that the average SOS, 0.500, is the same as the ratio of total Wins to total Team Games. This is what we would expect in a closed system. The ratio of total Wins to total Team Games will always be 0.500 if every game has a winner, but the SOS is dependent on the outcome of the games.
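The tie rule can be sketched in Python like so: sort on the primary rating, then the secondary rating, then the name, and let a team share the previous rank only when both ratings match. The function name rank and the second set of teams are hypothetical, and comparing the ratings for exact equality is my own assumption about how a 'tie' is detected.

```python
def rank(rows):
    """rows: (name, primary, secondary) tuples. Return [(rank, name), ...]."""
    # Highest primary first, then highest secondary, then alphabetical.
    ordered = sorted(rows, key=lambda row: (-row[1], -row[2], row[0]))
    ranked = []
    previous = None
    for position, (name, primary, secondary) in enumerate(ordered, start=1):
        if previous == (primary, secondary):
            place = ranked[-1][0]  # still tied: share the earlier rank
        else:
            place = position       # otherwise the rank is the list position
        ranked.append((place, name))
        previous = (primary, secondary)
    return ranked


# SOS as primary key, RPI as secondary key, from the four-team table.
birds = [("Eggs", 0.500, 0.375), ("Hens", 0.500, 0.542),
         ("Roosters", 0.500, 0.625), ("Chicks", 0.500, 0.458)]
print(rank(birds))

# Hypothetical teams showing an unresolved tie: #1, #2, #3, #3, #5, #6.
made_up = [("Ants", 0.6, 0.5), ("Bees", 0.5, 0.5), ("Crows", 0.4, 0.4),
           ("Ducks", 0.4, 0.4), ("Eels", 0.3, 0.2), ("Frogs", 0.2, 0.1)]
print(rank(made_up))
```

In the birds example every SOS is equal, so the RPI alone decides the order and nobody shares a rank.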

Insert the names 'Cats' and 'Maggots' into your birdtms.txt file (or remove the <;> comments indicator from above) to include the games played in 'week 3' by these new extra-division opponents against the birds division.

Roosters
Hens
Chicks
Eggs

Maggots

Cats

Run the program and now our data shows...

#  Team      SOS    G  W  WP     OWP    OOWP   RPI    #
1  Cats      0.794  1  0  0.000  1.000  0.375  0.595  1
2  Hens      0.472  3  2  0.667  0.500  0.417  0.521  3
3  Chicks    0.472  3  1  0.333  0.500  0.417  0.438  4
4  Roosters  0.447  4  4  1.000  0.375  0.594  0.585  2
5  Eggs      0.365  4  0  0.000  0.375  0.344  0.274  6
6  Maggots   0.124  1  1  1.000  0.000  0.375  0.343  5

...what we would expect?? Note that the 'Cats', who never won a game, are rated #1 in strength of schedule because they played and lost to a team with a perfect winning percentage. Similarly, the 'Maggots', who have a perfect winning percentage, are rated #6 because their only opponent had the worst record of all the combatants.

-
 
PICture 'RPI' Engine by...
Signal Computing Service Ltd.
http://members.shaw.ca/macbride/
macbride@shaw.ca