Recalculated CA ratings
2016-01-05 16:20 GMT
CA ratings were recalculated today. They include 3v3 matches and use a new metric to evaluate each player's performance in a match:
damage_dealt/100 + 0.25*kills is used to order all players (regardless of teams) and decide pairwise winner, loser or a draw within a 2pt margin.
Also, for any new matches, the number of rounds played per player is counted and used instead of time_played to scale up partial play time in a match.
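As a rough illustration of the metric described above, here is a minimal Python sketch. The function names (`performance`, `scaled_performance`) and the simple linear scaling by rounds are assumptions made for illustration; the post only states the formula and that rounds played replace time_played.

```python
def performance(damage_dealt, kills):
    """Per-match performance value: damage_dealt/100 + 0.25*kills."""
    return damage_dealt / 100 + 0.25 * kills


def scaled_performance(damage_dealt, kills, rounds_played, total_rounds):
    """Scale a partial participation up to the full match length.

    Assumption: plain linear scaling by rounds played (hypothetical;
    the actual scaling used by the site is not spelled out).
    """
    if rounds_played <= 0:
        raise ValueError("player did not play any rounds")
    return performance(damage_dealt, kills) * total_rounds / rounds_played


# A player who dealt 3000 damage with 8 kills while playing 6 of 12 rounds.
example = scaled_performance(3000, 8, rounds_played=6, total_rounds=12)
print(example)
```

Under this assumption, a player who joined halfway is compared as if the same pace had been kept for the whole match.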
Also, if teams are unbalanced this will disproportionately punish those on the lower-ranking team: it's easier to do damage when your teammates are skilled. Example: 2 players have elo=1800. One is on the "strong" team, the other on the "weak" one. It will be more difficult for the player on the weak team to do damage, hence that player's elo will go down when in fact it should stay the same.
I would advocate the use of a ranking system that rewards teamplay in team games. I think qlranks used to do something like this and I think it worked better. Today what I see is lower-ranked players trying to get better elo by waiting for the end of rounds, and higher-ranked players rail-camping until they are last so they can do tons of damage.
If a team-based elo calculation is not possible, I would go for a within-team elo system, meaning that my scores get compared to those in my team and not those of the other team.
--NLKM
Seriously, if you hate +back so much just play a different gamemode or a different game altogether.
This system is still flawed, but it is much better than the way qlranks was doing it. PredatH0r, is this now comparing each player to every other player on the server as a 1v1 during the match? If not, how does it determine which players to compare? I feel it should compare every player to every other player as a 1v1 per match. I know mplayer's elo system used to do it that way back in the day, but I'm not sure what you mean by "is used to order all players (regardless of teams) and decide pairwise winner, loser or a draw within a 2pt margin." Is it just comparing one person to another random person in that match, or is it comparing all players to all other players (which imo is the most accurate way to do it)?
For the remaining players a performance value is calculated, then the values of all players are compared to sort the list. You basically "beat" everyone below you in that list and lost to everyone above you in the list. A multiplayer match is treated like n*(n-1)/2 duels. For this calculation teams are completely irrelevant.
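To make the "n*(n-1)/2 duels" idea concrete, here is a small Python sketch. The player names and performance values are made up, `pairwise_duels` is a hypothetical helper, and treating a difference of exactly 2 points as a draw is an assumption.

```python
from itertools import combinations

DRAW_MARGIN = 2.0  # the "2pt margin" from the announcement


def pairwise_duels(perf_by_player, margin=DRAW_MARGIN):
    """Turn one multiplayer match into a list of 1v1 results.

    Returns (a, b, score_for_a) for every unordered pair of players,
    where score_for_a is 1.0 (a won), 0.0 (a lost) or 0.5 (draw).
    Teams are ignored on purpose.
    """
    duels = []
    for (a, perf_a), (b, perf_b) in combinations(perf_by_player.items(), 2):
        if abs(perf_a - perf_b) <= margin:  # assumed inclusive margin
            score = 0.5
        elif perf_a > perf_b:
            score = 1.0
        else:
            score = 0.0
        duels.append((a, b, score))
    return duels


# Hypothetical performance values for a 2v2 match.
perfs = {"alice": 64.0, "bob": 63.0, "carol": 40.0, "dave": 25.5}
duels = pairwise_duels(perfs)
for a, b, score in duels:
    print(a, "vs", b, "->", score)
```

With 4 players this yields 4*3/2 = 6 duels; alice and bob are within the margin, so that pair counts as a draw.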
The only problem with that is: if I play on e.g. the Nicer Honks server every night, and 11 other people are at 2000+ elo and I am at 1400 elo, I am rarely going to keep up in total score with those guys, even if I have a "great game" for me. E.g. a good game for me might be 5k dmg, positive k/d, 40% overall acc... I will still finish 10th-12th on the server because those guys are doing 7k-8k dmg.
So even if I play well but not as well as these better players, my elo goes down. Does it take into account that I have a 1400 elo and the other 11 guys are at 1900, 2000, 2200 elo?
Btw PredatH0r, thank you very much for doing this, and I'm not complaining, just trying to figure out how it all works. Your elo system works pretty well, we've been having more balanced games now, fewer 10-1 blowouts. Thank you for all your time and effort, many many of us appreciate it!!
The amount of points gained or lost depends on the outcome (win/loss/draw), your rating before the match, everyone else's rating before the match, everyone else's RD before the match and your own RD.
Low RD values (like 30) mean that the system is confident about the current rating and will only adjust it slightly, even for unexpected outcomes. This aspect of glicko is actually something that worries me. Once your RD is low, you get stuck around your rating and won't gain much by beating severely better players.
If you get beaten by players with a higher rating, it also tends to keep your loss small.
But there is no simple rule-of-thumb. The Glicko update formula is quite complicated and with so many input factors it's hard to generalize things. Too many dependencies between the various factors.
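For anyone who wants to play with the numbers, below is a sketch of the textbook Glicko-1 update in Python. This is not the site's code: whether it uses Glicko-1 or a variant, and how a match is fed into it, are assumptions here. Each pairwise duel is passed as one `(opponent_rating, opponent_rd, score)` tuple, and `min_rd=30` mirrors the minimum RD mentioned elsewhere in this thread.

```python
import math

Q = math.log(10) / 400


def g(rd):
    """Dampening factor: the less certain the opponent's rating, the less a result counts."""
    return 1 / math.sqrt(1 + 3 * Q**2 * rd**2 / math.pi**2)


def expected(r, r_j, rd_j):
    """Expected score (0..1) against an opponent with rating r_j and deviation rd_j."""
    return 1 / (1 + 10 ** (-g(rd_j) * (r - r_j) / 400))


def glicko1_update(r, rd, results, min_rd=30):
    """Standard Glicko-1 update for one rating period.

    results: list of (opponent_rating, opponent_rd, score),
             score being 1.0 for a win, 0.5 for a draw, 0.0 for a loss.
    Returns (new_rating, new_rd); new_rd is clamped to min_rd (assumed floor).
    """
    if not results:
        return r, rd
    d2_inv = Q**2 * sum(
        g(rd_j) ** 2 * expected(r, r_j, rd_j) * (1 - expected(r, r_j, rd_j))
        for r_j, rd_j, _ in results
    )
    denom = 1 / rd**2 + d2_inv
    delta = Q / denom * sum(
        g(rd_j) * (s - expected(r, r_j, rd_j)) for r_j, rd_j, s in results
    )
    return r + delta, max(min_rd, math.sqrt(1 / denom))


# Same upset (1500 beats five 2200 players), once with low and once with high RD.
opponents = [(2200, 30, 1.0)] * 5
settled, _ = glicko1_update(1500, 30, opponents)
fresh, fresh_rd = glicko1_update(1500, 350, opponents)
print("gain with RD 30: ", settled - 1500)
print("gain with RD 350:", fresh - 1500)

# Losing to a much higher rated player barely costs anything.
after_loss, _ = glicko1_update(1500, 30, [(2200, 30, 0.0)])
print("loss vs 2200:    ", after_loss - 1500)
```

The exact numbers from this textbook formula need not match what the site produces, but the pattern described above shows up: a low RD keeps changes small, a high RD allows very large jumps, and an expected loss costs little.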
When you only play on one server where most people are better than you, you will have a low rating.
If someone else plays only on another server, where he beats most players, his rating will be higher. As long as the 2 communities stay on their servers, those ratings are fine for shuffling on those servers. But the ratings are not comparable to each other. Only when everyone plays everyone else are the ratings comparable.
That's why it is possible for mediocre players to have a high rating or good players to have a low rating. They don't provide enough input to the system to be properly rated. The more players you play, the more accurate the rating is. Playing the same 10 players 100000 times won't do any good.
Yes I have noticed that a significant win with good KDR will still result in a negative glicko. When I finally get a large + glicko I find that I have to score 2/3K more than others in the server.
The more players you play, the more accurate the rating is. Playing the same 10 player 100000 times won't do any good.
This is very hard to avoid in Australia, where we only have enough players for 2 servers. So is getting a massive kill/death ratio the only way to significantly jolt into higher elo? I have seen some players who are quite average in skill gain 400+ elo in a single map, even though they were mid-table in KDR and points :/
luck of the irish
You don't get completely stuck. I keep a minimum RD of 30 atm, which still leaves room for reasonable changes.
The problem is that the initial RD of 350 allows changes that are way too large. When I recalculate the data the next time, I'll most likely lower this or add some other measure to prevent severe overshooting (e.g. a 1500 player with RD 350 beating 5 players with 2200 could gain 1200 in a single match)
What is the minimum number of rounds? And am I just a victim of anecdotal evidence, noticing it only when it matters to me?
the "perf" value is scaled up from your played rounds to the total rounds, so on paper the number of rounds you played should not matter.
people will change behavior as soon as they know they are being observed - and that change is more often based on wrong assumptions than on actual facts. even if i spent the rest of my lifetime tweaking this system, the amount of criticism would stay the same.
i wanted to create a system for shuffling teams better than QL's 50/50 random generator. for that the ratings are good enough. beyond that, i don't care much.
And don't get me wrong, I think you have done a good job. I think in general, CA glicko's (given time) are a decent representation of someone's expected performance, and usually result in decent teams. There are many caveats on that last point, but they pretty much all rely on people acting as they should - ie. playing their best, not quitting, not sooking, not increasing teamsize mid game and stacking teams, etc etc.
Keep up the good work.
kiss hug,
Anonymous
you can end up first or 2nd on the scoreboard, win the round, and still lose elo against players who on average have 400+ more elo than you
it doesn't work at all, the new rating
used to play a lot higher in terms of elo, but how can you even consider climbing if you're #1 on the scoreboard and still lose elo for that?
Dexxa
With the QLRanks method there was always a rule of diminishing returns. It would be: start out, win lots, get high ELO, then the ranking system would always put you on the worst possible team because it automatically thinks you're a one-man superhero, you lose, then your ELO stays the same forever. You might say 'well that's just your level', but what the systems don't understand is how QL works (especially in CA) with player mixes. It's far too ignorant to just match ELO vs ELO, because one man's ELO is different from another's and was obtained with different styles of play and tactics.
For instance you could have 4v4: 3x 1800elo aggressive style 1x 1800elo defensive style - vs - 3x 1800elo defensive style and 1x 1800elo aggressive style. This will cause completely different outcomes which have nothing to do with their rating, which is why I think ELO in Quake is fundamentally flawed.
Sometimes QLStats shuffles are very odd, but seem a whole lot better than QLRanks.