Help Me Workshop A New Scoring System

Postby jay65536 » 18 Sep 2018, 21:29

So a while back, we had an active thread about the scoring systems currently in use, and I mentioned that I was working on a new system--not just a new system, but a new kind of system. I've gotten it to the point where I have tested it by re-scoring a tournament that was run on this site (the PlayDiplomacy top league). I thought I would post it here to get feedback and to catch any points I might have overlooked.

My system is an attempt to make a hybrid of draw-based scoring, which a lot of the online community seems to prefer religiously, and lead-based scoring, which is very popular in FtF because it's shown to make for more fun games in that setting. Here is how it works:

-Each game is worth a total of 252 points. If you lose, you get 0. If you solo, you get 252. If you get a 2way, you get 126.
-In any other draw result, scoring consists of two steps: (1) We first compute an "advantage number" (A) for the board, and whoever tops the board gets a fixed score based on the value of A. (2) After the points are given out to the table-topper, all other draw participants split all remaining points equally.

The only other detail of this system is how to compute A, which is a function of the top two center counts. Here is how it works:
1. If the top two center counts are within 1 of each other--this of course includes ties--then A=0 and the 252 points are equally shared among all draw participants. (For everything below, we assume that the table-topper leads by 2 or more centers.)
2. If the table-topper has 7--9 centers, the table-topper gets 63 points (i.e. one-fourth of the points).
3. If the table-topper has 10--12 centers, the table-topper gets 84 points (i.e. one-third of the points).
4. If the table-topper has 13--17 centers and no one else has more than 11, A is equal to the top center count minus 12.
5. If the top two center counts are both over 11, A is equal to the (positive) difference between the top two center counts, minus 1.

In both of the last two cases, the table-topper's score is 84 + 6A.

That's pretty much it.

Examples:

-12/11/11 3way draw is 84 points per player.
-15/14/5 3way draw is 84 points per player.
-14/10/10 3way draw gets scores of 96/78/78 (A = 14-12 = 2, so the table-topper gets 84 + 6*2 = 96; then 252-96 = 156, split 2 ways).
-17/14/3 3way draw gets scores of 96/78/78 (because 17-14-1 = 2, same as above).
-12/8/8/6 4way draw gets scores of 84/56/56/56 (because 168/3 = 56).
-15/7/6/6 4way draw gets scores of 102/50/50/50 (A = 15-12 = 3, so the table-topper gets 84 + 6*3 = 102; then 150/3 = 50).
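
A rough Python sketch of the rules above, which reproduces the example scores. The function names `topper_score` and `score_draw` are made up for illustration, and the sketch assumes the input lists only the center counts of the draw participants (solos and losses are handled outside it). Exact fractions are used because some equal splits, such as a 5way, do not divide 252 evenly.

```python
from fractions import Fraction

TOTAL = 252  # points available in every game


def topper_score(top, second):
    """Fixed score for the table-topper, or None if the draw is shared equally."""
    if top - second <= 1:   # rule 1: a lead of 0 or 1 means A = 0
        return None
    if top <= 9:            # rule 2: 7-9 centers
        return 63
    if top <= 12:           # rule 3: 10-12 centers
        return 84
    if second <= 11:        # rule 4: 13-17 centers, nobody else above 11
        a = top - 12
    else:                   # rule 5: top two counts both above 11
        a = top - second - 1
    return 84 + 6 * a


def score_draw(centers):
    """Scores for a draw, in the same order as the given center counts."""
    n = len(centers)
    if n < 2 or max(centers) >= 18:
        raise ValueError("expected a draw between at least two powers")
    ranked = sorted(centers, reverse=True)
    # A 2way is always an even split, whatever the center counts are.
    fixed = None if n == 2 else topper_score(ranked[0], ranked[1])
    if fixed is None:
        return [Fraction(TOTAL, n)] * n
    rest = Fraction(TOTAL - fixed, n - 1)   # everyone else splits the remainder
    leader = centers.index(ranked[0])       # unique, since the lead is 2 or more
    return [Fraction(fixed) if i == leader else rest for i in range(n)]


for board in ([12, 11, 11], [14, 10, 10], [17, 14, 3], [15, 7, 6, 6]):
    print(board, [str(s) for s in score_draw(board)])
```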

Since I imagine this system requires tiebreaks, here's what I thought to use:
1 - Most solos
2 - Most games outscoring your opponents
3 - Most games topping the table
4 - Highest-scoring draw result (all the way down the list if necessary)
5 - Highest center count in best-scoring draw (all the way down the list if necessary)

Any questions or feedback, that's why the thread is here! Interested to hear what people think.
jay65536
 
Posts: 412
Joined: 10 Sep 2016, 18:13
Class: Ambassador
Standard rating: 1105
All-game rating: 1111
Timezone: GMT-5

Re: Help Me Workshop A New Scoring System

Postby NoPunIn10Did » 18 Sep 2018, 23:03

Are the parameters you have here for calculating A something that could be translated more generally (via percentages)? Or is this scoring system meant only for vanilla Dip?

Do people really like draw-sized scoring (DSS) where the scores aren't equal? I know Dixiecon uses a system like that, but it's an artifact of sorts. I thought the appeal of DSS in the modern era was mostly related to keeping the "all in a draw are equal" business intact.
NoPunIn10Did
Lead Volunteer Developer

Forum Administrator

Variant GM & Designer
User avatar
NoPunIn10Did
Premium Member
 
Posts: 2423
Joined: 17 Aug 2011, 00:17
Location: North Carolina
Class: Ambassador
Standard rating: 1000
All-game rating: 1501
Timezone: GMT-5

Re: Help Me Workshop A New Scoring System

Postby NoPunIn10Did » 18 Sep 2018, 23:06

This also creates a perverse incentive to throw SCs to the second-place player. I'm not sure any other system exists where losing SCs to another player can directly increase your score.

Re: Help Me Workshop A New Scoring System

Postby mjparrett » 19 Sep 2018, 00:58

If the PDL taught me one thing, it is that you won't get consensus on something like this. And generally, if it ain't broke, don't fix it. The scoring system we have here is wonderful and works fine. So let us leave it be. That is not to say yours potentially isn't. But you'll have as many people as currently like/dislike the system in place. Everyone is treated the same by the system and there are no injustices that need solved. So what is the problem?
mjparrett
 
Posts: 367
Joined: 01 Mar 2017, 20:05
Location: Scotland
Class: Star Ambassador
Standard rating: 1488
All-game rating: 1524
Timezone: GMT

Re: Help Me Workshop A New Scoring System

Postby GhostEcho » 19 Sep 2018, 03:27

I'm happy to help think things through, but I think in order to offer improvements it's necessary that people have a clear idea what your goals are - what kind of play to encourage/discourage, what optimal results under certain conditions are (e.g. I've also been fiddling with my own idea for a while now, but I'm stymied by trying to fit it to the realities of tournament play).
"When you absolutely don't know what to do any more, then it's time to panic." - Johann van der Wiel
"I'm not panicking, I'm watching you panic. It's more entertaining." - Elli Quinn
"[Diplomacy:] No dice or chance. Just calculated insincerity." - Counter Trap
User avatar
GhostEcho
Premium Member
 
Posts: 1839
Joined: 10 Aug 2008, 04:56
Location: Baltimore
Class: Ambassador
Standard rating: 995
All-game rating: 957
Timezone: GMT-5

Re: Help Me Workshop A New Scoring System

Postby NoPunIn10Did » 19 Sep 2018, 04:23

If the goal is to hybridize draw-size scoring with a lead-based system, why not just score a draw as the average of DSS and SOS (both expressed as a value between 0 and 1)?
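
A minimal sketch of what such an average might look like. It assumes SOS (sum of squares) gives each player their squared center count divided by the sum of all squared center counts, that DSS gives each draw participant an equal share, and that every surviving power is in the draw. The function names are illustrative only.

```python
from fractions import Fraction


def dss_share(centers):
    """Draw-size share: every draw participant gets 1/n."""
    n = len(centers)
    return [Fraction(1, n)] * n


def sos_share(centers):
    """Sum-of-squares share: squared centers over the sum of squared centers."""
    total = sum(c * c for c in centers)
    return [Fraction(c * c, total) for c in centers]


def averaged(centers):
    """Plain average of the two shares, each already between 0 and 1."""
    return [(d + s) / 2 for d, s in zip(dss_share(centers), sos_share(centers))]


print([str(x) for x in averaged([14, 10, 10])])
```

Because both components sum to 1 across the board, the averaged shares also sum to 1.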

Re: Help Me Workshop A New Scoring System

Postby super_dipsy » 19 Sep 2018, 07:00

Jay, I applaud the intent to try to develop a synthesis of different systems. I always feel this is a good way to get an overall benefit :)

I guess the thing that leaps out at me with your proposal though is that there is no recognition of player strengths. That may well be intentional, but it is one of the most common threads we have seen over the years to ALL discussions on scoring. People like to feel that if they do well against 'strong' players they should benefit more than against 'weaker' players. Did you leave this out intentionally? If so, then I guess this is an attempt to develop a new tournament scoring system rather than a general scoring system like we have today? It might be worth clarifying that in your title for the thread to avoid confusion ;)
User avatar
super_dipsy
Premium Member
 
Posts: 12048
Joined: 04 Nov 2009, 17:43
Class: Ambassador
Standard rating: 1000
All-game rating: 941
Timezone: GMT

Re: Help Me Workshop A New Scoring System

Postby NoPunIn10Did » 19 Sep 2018, 15:28

super_dipsy wrote:Jay, I applaud the intent to try to develop a synthesis of different systems. I always feel this is a good way to get an overall benefit :)

I guess the thing that leaps out at me with your proposal though is that there is no recognition of player strengths. That may well be intentional, but it is one of the most common threads we have seen over the years to ALL discussions on scoring. People like to feel that if they do well against 'strong' players they should benefit more than against 'weaker' players. Did you leave this out intentionally? If so, then I guess this is an attempt to develop a new tournament scoring system rather than a general scoring system like we have today? It might be worth clarifying that in your title for the thread to avoid confusion ;)


Since this is a fixed-sum system, it wouldn't be hard to numerically transform it to something compatible with our current Elo setup. However, I think this rating system isn't meant for our website generally, but rather as an option for limited engagements like the league.

Re: Help Me Workshop A New Scoring System

Postby boldblade » 19 Sep 2018, 15:52

Why is six the multiplier in cases 4 & 5?
Why break up the scoring at 7-9, 10-12, 13-17 centers as opposed to making it just two groups or anything else?
Why minus 1?
boldblade
 
Posts: 338
Joined: 05 Feb 2014, 17:33
Class: Star Ambassador
Standard rating: (1474)
All-game rating: (1488)
Timezone: GMT

Re: Help Me Workshop A New Scoring System

Postby jay65536 » 20 Sep 2018, 20:16

Lots of good questions in here! I hope I didn't miss anything big...

To dipsy's point:

NoPunIn10Did wrote:Since this is a fixed-sum system, it wouldn't be hard to numerically transform it to something compatible with our current Elo setup. However, I think this rating system isn't meant for our website generally, but rather as an option for limited engagements


This is pretty much exactly what I would have said.

NoPunIn10Did wrote:If the goal is to hybridize draw-size scoring with a lead-based system, why not just score a draw as the average of DSS and SOS (both expressed as a value between 0 and 1)?


Because I want a system that's easier to understand than something like that (which piece of the average should a player prioritize more? Who knows?) and that doesn't make the statistically dubious step of averaging percentages (both Calhamer points and SoS scores are percentages). In general, I'm trying to make an actual hybrid, not a composite. (Also, SoS is a hybrid, not a lead-based system.)

GhostEcho wrote:I'm happy to help think things through, but I think in order to offer improvements it's necessary that people have a clear idea what your goals are - what kind of play to encourage/discourage, what optimal results under certain conditions are (e.g. I've also been fiddling with my own idea for a while now, but I'm stymied by trying to fit it to the realities of tournament play).


I'm not 100% sure this will answer your question--maybe you can be more specific?--but this system is my attempt to reconcile the alleged theoretical benefits of draw-based scoring with the drawbacks people have actually experienced.

The main ethos of draw-based scoring is that you should get more points the more opponents you've beaten, but anyone who hasn't been eliminated (or voted themselves out, in the noDIAS case) doesn't count as "beaten". The alleged benefit of this idea, from a standpoint of making games more fun, is that small powers can fight for an equal share of the draw and they will not give up. However, in practice, draw-based systems lead to MORE positions refusing to fight, not fewer--they create an incentive for the 2 or 3 biggest powers to collude against the little guys and make their positions more hopeless than if the biggest powers were fighting each other.

The main thing I was trying to accomplish in my system was to stay as close to equal draw sharing as I could, while also putting in a twist that removes the table-topper's incentive to kill off small powers. In other words, imagine a situation where the center counts are 13/9/9/3. In a typical draw-based system, all 3 powers at the top would have an incentive to kill the 3-center power, with very little regard for who gets the centers. In my system, as long as no one else gets up to 12 centers, the 13-center power's score does not depend on the draw size--so (I hope) there is more of a reason for the 13- and 3-center powers to negotiate with each other and potentially work together. Also, if the game ended with those center counts, the scores would be 90/54/54/54--so the 9-center powers have a vested interest in killing off the little guy to improve their scores, which creates more of a reason for the top powers to compete with each other.
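
To make the 13/9/9/3 arithmetic concrete, here is a small sketch. It assumes rule 4 applies (a 13-center table-topper with no one else above 11), and `non_topper_share` is a hypothetical helper name.

```python
from fractions import Fraction

TOTAL = 252


def non_topper_share(topper, draw_size):
    """Each other draw participant's share once the topper's points are removed."""
    return Fraction(TOTAL - topper, draw_size - 1)


topper = 84 + 6 * (13 - 12)  # rule 4 with A = 13 - 12 = 1
for draw_size in (4, 3):
    # The topper's score is the same in both rows; only the others' share moves.
    print(draw_size, topper, non_topper_share(topper, draw_size))
```

The table-topper gets 90 either way, while each 9-center power would go from 54 in the 4way to 81 in a 3way.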

Because top powers competing with each other and working with smaller powers is how solos often happen, I would hope that my system would also be better at encouraging people to play for the win than a straight draw-based system.

Now, of course, if you don't like the idea of equal draw sharing, even in the abstract, this isn't the system for you--but there are lead-based and center/lead hybrid systems out there already, many of which I do like.

NoPunIn10Did wrote:This also creates a perverse incentive to throw SCs to the second-place player. I'm not sure any other system exists where losing SCs to another player can directly increase your score.


This is definitely something I've thought about and could be considered a weak point in my system. I guess I'd say 2 things to that:

1. This is the kind of thing that may be a theoretical weak point but that, in practice, wouldn't happen much.
2. If it does happen, how bad is it anyway?

Obviously my system has not really been tried yet, so I can't speak to #1, except to say that draw-based systems can have this problem too. This actually happened in a game I've played here: the center counts were 15/8/7/4. The 15-center Germany was asking me--the 4-center Turkey--and Austria to stab Italy and make a 3way draw. I told him I would not stab unless he lessened his solo threat by handing centers to Austria. So that's a situation where someone needed to lose centers in order to increase their score. In practice, though, Germany refused and took a 4way. I wonder how many theoretical examples of throwing centers to 2nd place would be refused in real time?

And that leads into point #2. I can imagine a situation where it's 13/11/10, and the 10 gives up a center so it's 13/12/9 and the draw is equally shared. To that I'd say, if the 13-center power can't prevent that, then probably he doesn't deserve a situation where the draw is not equally shared, right? But on the other hand, imagine it's 15/12/7. The 7-center power could improve their score by giving centers to the 12-center power--but that situation is playing with fire! If it's 15/13/6 or 15/14/5, the small power is now risking being squeezed out to a 2way. And if the small power can successfully navigate that situation, then perhaps that could be argued to be enough of a skill display that it's worth the increase in score. One of the main ideas of equal draw sharing is that center counts don't really matter as long as you're in the draw; this seems to be in line with that. Again, if you don't like equal draw sharing, not the system for you.
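
The 15/12/7 case can be traced with a short sketch. It covers only 3way draws in which the top two counts are both above 11 (rule 5) or within 1 of each other (rule 1); `small_power_score` is an illustrative name.

```python
def small_power_score(top, second):
    """Score of a non-topper in a 3way where the top two counts are both above 11."""
    a = max(top - second - 1, 0)   # rule 5; a lead of 0 or 1 gives A = 0
    topper = 84 + 6 * a            # with A = 0 this is just the equal share of 84
    return (252 - topper) // 2     # the other two powers split what is left


for top, second, third in ((15, 12, 7), (15, 13, 6), (15, 14, 5)):
    print(f"{top}/{second}/{third}", small_power_score(top, second))
```

Each center handed to second place raises the small power's score by 3 points, from 78 to 81 to 84, while leaving it with fewer units to defend itself.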

NoPunIn10Did wrote:Are the parameters you have here for calculating A something that could be translated more generally (via percentages)? Or is this scoring system meant only for vanilla Dip?


Not via percentages, but I believe the thought process behind the formula for A is easily generalized--though I admit I didn't have variants in mind while thinking of this. PM me for more details if you want (and if what I write below doesn't help answer this question).

boldblade wrote:Why minus 1?


To make it consistent with the fact that center counts within 1 of each other don't count as an advantage. In other words, a lead of 2 is the smallest that "counts", so I subtract 1 when counting how big a lead it is.

boldblade wrote:Why break up the scoring at 7-9, 10-12, 13-17 centers as opposed to making it just two groups or anything else?


In a typical draw-based system, the gold standard of a "good result" is a 3way draw; in my system, an equally shared 3way draw is worth 84 points. In terms of center counts, though, an "equitable" 3way distribution is 12/11/11 or else 12/12/10. This is why I fixed the score for 10--12 centers with an advantage at 84 points--so that, once you hit 10 centers and you're topping the board, you can't gain points by draw-whittling without gaining centers. At least, that's why it's not less than 84. It's not more than 84 because I wanted to ensure that in my system, you don't get more than 84 points unless you get more than 12 centers--i.e. more than the number of centers you could have in an "equitable" 3way draw. So that accounts for why 10--12 is one group and 13--17 is another.

For a similar reason, 7--9 with an advantage is fixed at 63--the point value of an equally shared 4way--so that you can't gain points by draw-whittling without gaining centers, unless you want to allow someone else to get 2+ centers ahead of you and risk triggering the situation in the above paragraph. (There could also be a rule that you can't vote a 3way draw until someone reaches 10 centers, which I think would be REALLY nice to have with this system.)

boldblade wrote:Why is six the multiplier in cases 4 & 5?


I mean, I tried to pick all the numbers to simplify how the results looked and to make the different results "scale" in sort of a balanced way. For example, 252 was chosen because it's close to a round number (250) and divisible by lots of small numbers. And then 6 was chosen because, in addition to the fact that it's divisible by 2 and 3, I think when you use 6 as the multiplier, you get balanced numbers (balanced between the "draw" part of the system and the "lead" part). This way, if you're dominating your board, you can get somewhat close to a 2way, and it's a big enough number that a 2way isn't such an insurmountable result anymore; but it still takes more than 2 such results to outscore a solo (which is different from SoS).
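
A quick numeric check of those claims, assuming the largest possible score in a draw bigger than a 2way comes from rule 4 with a 17-center table-topper (A = 5). The variable names are illustrative.

```python
TOTAL = 252

# Draw sizes from 2 to 7 that split 252 into whole numbers.
even_splits = [n for n in range(2, 8) if TOTAL % n == 0]

best_draw = 84 + 6 * (17 - 12)  # rule 4 with A = 17 - 12 = 5
two_way = TOTAL // 2

print(even_splits)                       # draw sizes with whole-number shares
print(best_draw, two_way)                # best lead-based draw vs. a 2way
print(2 * best_draw < TOTAL)             # two such draws still trail one solo
```

Under these assumptions, a 5way is the only draw size whose equal split is not a whole number (252/5 = 50.4).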

I can try to give some examples if you want but maybe that's a good enough explanation?...
