Tier List based on Shuffle ladder

Hey there.

Since I had read several threads today about nerfs/buffs for the same classes and wanted to test whether ChatGPT is suitable as a programming aid, I thought I could write a program with AI support that evaluates the shuffle ladder and ranks specs by strength.

For this, the program analyzed the top 100 players of each spec (except tanks and outlaw rogue). To determine the strength of each spec, it multiplied the spec’s average rating by its average win ratio.
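For anyone who wants to reproduce the metric, here is a minimal Python sketch of the calculation described above. The function name, the `(rating, wins, losses)` tuple layout and the sample players are assumptions made up for illustration; the original program and the way it reads the ladder are not shown in this thread.

```python
from statistics import mean


def spec_strength(players):
    """Average rating times average win ratio for one spec.

    `players` is a list of (rating, wins, losses) tuples. This layout is
    an assumption; the real ladder data may be shaped differently.
    """
    avg_rating = mean(rating for rating, _, _ in players)
    avg_win_ratio = mean(wins / (wins + losses) for _, wins, losses in players)
    return avg_rating * avg_win_ratio


# Hypothetical sample, not real ladder data.
sample = [(2400, 60, 40), (2200, 50, 50)]
print(round(spec_strength(sample), 2))

# Cross-check against the averages posted for druid/balance in the table.
print(round(2417.95 * 0.5638, 2))  # 1363.24, matching the Spec Strength column
```

Averaging the per-player win ratios is only one possible reading of "average win ratio"; pooling all wins and losses of a spec first would give slightly different numbers.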

Here is the result, with the last two columns showing how far they are from the median:

| Spec | Avg Rating | Win Ratio | Spec Strength | D2Median | D2Median in % |
| --- | --- | --- | --- | --- | --- |
| druid/balance | 2417.95 | 0.5638 | 1363.24 | 174.82 | 14.71 |
| warlock/destruction | 2331.3 | 0.5762 | 1343.3 | 154.88 | 13.03 |
| monk/mistweaver | 2303.31 | 0.5682 | 1308.74 | 120.32 | 10.12 |
| shaman/elemental | 2331.16 | 0.5471 | 1275.38 | 86.96 | 7.32 |
| warlock/demonology | 2305.47 | 0.5485 | 1264.55 | 76.13 | 6.41 |
| priest/shadow | 2363.35 | 0.5337 | 1261.32 | 72.9 | 6.13 |
| rogue/subtlety | 2272.57 | 0.548 | 1245.37 | 56.95 | 4.79 |
| druid/restoration | 2273.99 | 0.5467 | 1243.19 | 54.77 | 4.61 |
| paladin/retribution | 2343.4 | 0.529 | 1239.66 | 51.24 | 4.31 |
| shaman/enhancement | 2289.61 | 0.5412 | 1239.14 | 50.72 | 4.27 |
| mage/fire | 2281.27 | 0.5406 | 1233.25 | 44.83 | 3.77 |
| priest/holy | 2247.32 | 0.5446 | 1223.89 | 35.47 | 2.98 |
| warrior/fury | 2226.68 | 0.5474 | 1218.88 | 30.46 | 2.56 |
| druid/feral | 2272.37 | 0.5304 | 1205.27 | 16.85 | 1.42 |
| warrior/arms | 2266.29 | 0.5264 | 1192.98 | 4.56 | 0.38 |
| monk/windwalker | 2261.93 | 0.5254 | 1188.42 | 0.0 | 0.0 |
| shaman/restoration | 2208.09 | 0.5354 | 1182.21 | -6.21 | -0.52 |
| hunter/survival | 2227.94 | 0.5268 | 1173.68 | -14.74 | -1.24 |
| evoker/preservation | 2185.11 | 0.5357 | 1170.56 | -17.86 | -1.5 |
| priest/discipline | 2217.18 | 0.5247 | 1163.35 | -25.07 | -2.11 |
| warlock/affliction | 2183.84 | 0.5304 | 1158.31 | -30.11 | -2.53 |
| demonhunter/havoc | 2203.34 | 0.5191 | 1143.75 | -44.67 | -3.76 |
| mage/frost | 2152.52 | 0.5216 | 1122.75 | -65.67 | -5.53 |
| hunter/marksmanship | 2142.08 | 0.5225 | 1119.24 | -69.18 | -5.82 |
| evoker/devastation | 2099.72 | 0.5309 | 1114.74 | -73.68 | -6.2 |
| hunter/beastmastery | 2084.12 | 0.5268 | 1097.91 | -90.51 | -7.62 |
| deathknight/unholy | 2076.91 | 0.5135 | 1066.49 | -121.93 | -10.26 |
| paladin/holy | 2058.76 | 0.518 | 1066.44 | -121.98 | -10.26 |
| rogue/assassination | 2043.24 | 0.5079 | 1037.76 | -150.66 | -12.68 |
| deathknight/frost | 1933.18 | 0.5193 | 1003.9 | -184.52 | -15.53 |
| mage/arcane | 1905.22 | 0.5187 | 988.24 | -200.18 | -16.84 |

According to the data, WW Monk is at the median, meaning that half of the other specs (15) are performing better on the ladder while the other half (15) are lagging behind.
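The two median columns can be recomputed from the posted Spec Strength values. This is a small sketch using Python's standard `statistics.median`; the helper name `distance_to_median` is made up here, and the list is simply the Spec Strength column of the top-100 table in posted order. With 31 specs the median is the 16th value, which is why exactly one spec sits at 0.

```python
from statistics import median

# Spec Strength column from the top-100 table, in posted order.
strengths = [
    1363.24, 1343.3, 1308.74, 1275.38, 1264.55, 1261.32, 1245.37, 1243.19,
    1239.66, 1239.14, 1233.25, 1223.89, 1218.88, 1205.27, 1192.98, 1188.42,
    1182.21, 1173.68, 1170.56, 1163.35, 1158.31, 1143.75, 1122.75, 1119.24,
    1114.74, 1097.91, 1066.49, 1066.44, 1037.76, 1003.9, 988.24,
]


def distance_to_median(value, all_values):
    """Return (absolute distance, distance in % of the median)."""
    mid = median(all_values)
    diff = value - mid
    return round(diff, 2), round(diff / mid * 100, 2)


print(median(strengths))                        # 1188.42 (monk/windwalker)
print(distance_to_median(1363.24, strengths))   # druid/balance
print(distance_to_median(988.24, strengths))    # mage/arcane
```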

I think that speaks for itself, but for better visualization, I’ve created a tier list based on this data which you can find here:

https://ibb.co/4NYPTYX

I’m actually surprised that the data almost perfectly matches my expectations. I would have thought that sub rogue and hunter would be higher, but I suppose that’s down to random groups and how hard it is to get good setups that way. Also, I expected DH to be higher, but whatever.

The rest, however, was exactly what I expected and it clearly shows the current wizard meta and how powerful control + range abilities + magic damage (ret/enha) really are! To be fair though, we need a god tier for boomy since S tier isn’t even close enough to show its strength!!

What do you think about this kind of analysis? Do you find it helpful to see a more statistical and therefore less biased tier list? Let me know!

Arms, Feral and Holy Priest into the middle of A tier and it’s pretty spot on I think.

But then it is biased again. The tier list is simply based on 3100 players, meaning the top 100 players of each spec, and how they are doing in shuffle.

Which is only fair if you have them playing the same amount of games into the exact same matchups. My comment is based on fighting all specs up to and at 2.5+ mmr. Those specs are better than B tier.

Thanks for posting that
Loads of really interesting specs in the plotted table
cries in DH

What your analysis does not take into account is that the data is distorted: there were several buffs and nerfs to many specs, so specs that used to be S tier still have high representation even though they are not S tier anymore (destro for example).

The timeframe of the data is important when analysing the data.

DH is pretty solid and not C tier at all haha. Prevoker and Holy Priest are A tier too in my opinion. Especially holy after the recent buffs.

In general it’s a decent list though

Can already bet there will be statistics gurus arguing your logic :rofl:

Thanks for posting, very interesting and imo truthful, and it correlates with what I’ve been experiencing in the game as well

Data says no
haha
Not that black and white but DH defensives are really questionable

It’s comfortably a B tier solo shuffle spec.

I have no idea what tier it is
I’m terrible at this game
Surprised I still have a full head of hair from Solo fiesta

Small Edit - I’d say B is probably around correct for RSS

I thought about that too, but then I figured that the mmr hotfix from 1.5 weeks ago would more than account for it. Depending on how active the players of each spec have been since then, though, the data could be insufficient for some specs: it is only a snapshot of the current ladder, and hotfixes need some time before they are reflected in it.

However, there were no patch notes that move a spec into a completely different tier, were there?

It might be that there is a huge gap between very good DHs and just solid ones, but I took the top 100 for a reason and not the top 20, for example, where the tier list looks different for some specs like Arms and Feral.

Maybe it’s better to use a cr cutoff, for example every player of a spec from duelist rating upwards? But then how do I account for the different sample sizes? Would it be fair to compare 350 boomkins with 100 DHs, for example? Any suggestions to improve it?

To be fair, havoc is only 3.76% behind the median. Just a tiny bit more win ratio or a higher average rating and it would have been in low B tier.

It’s not too bad, but I wanted to keep the bias out and just place it the same way I placed every other spec.

Depending how long it takes to feed the data
Do 1800+
2100+
2400+
Would this work ?
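A rough sketch of how the bracket idea could look in Python. It reports the number of players next to each bracket's strength, since the brackets will hold very different sample sizes per spec. The function name, the `(rating, win_ratio)` layout and the sample players are hypothetical; this is not the program used for the tables in this thread.

```python
from statistics import mean


def strength_by_cutoff(players, cutoffs=(1800, 2100, 2400)):
    """Spec strength for every player at or above each rating cutoff.

    `players` is a list of (rating, win_ratio) tuples for one spec
    (assumed layout). Returns {cutoff: (player_count, strength)} and
    skips cutoffs that no player reaches.
    """
    result = {}
    for cutoff in cutoffs:
        bracket = [(r, w) for r, w in players if r >= cutoff]
        if not bracket:
            continue
        avg_rating = mean(r for r, _ in bracket)
        avg_win_ratio = mean(w for _, w in bracket)
        result[cutoff] = (len(bracket), round(avg_rating * avg_win_ratio, 2))
    return result


# Hypothetical players, not real ladder data.
players = [(1850, 0.51), (2150, 0.54), (2450, 0.58), (2500, 0.60)]
for cutoff, (count, strength) in strength_by_cutoff(players).items():
    print(f"{cutoff}+: {count} players, strength {strength}")
```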

ahh I was just memeing lol
I genuinely expected to see it higher ngl

It would definitely work, but it will take some time to do the coding since the current one only takes the first page of each spec (top 100). Based on ratings I need to modify it quite a bit and I don’t have enough time during the week, so maybe next weekend if I’m in the mood. :blush:

I see. :+1:

Above 1800 it’s pretty solid, but it drops off sharply above 2200, for example. Don’t really know why tbh. When I play with DHs they feel insanely strong haha.

But yes, the data shows that it’s not that good at top ratings in shuffle, so probably not A tier. Definitely not C tier though.

destro for example is way less represented compared to a few weeks ago.
Holy Priests are way more dominant now. Both definitely moved a tier.

Not sure, but probably a time limit, like how the ladder looked over the last 2 weeks or something, but that’s not possible with the data we have I think.

Unfortunately it is not. We can take the data again in 2 weeks and see how it moved then but I don’t have any data from 2 weeks ago, just the current ladder.

I still think the mmr hotfix would have done the work, as the average rating for the other specs should have become much higher compared to when destro got nerfed.

Yeah sounds quite reasonable. Above 2200 players probably know exactly how to punish DH so a small mistake = hold cheeks

When things go right I can see why people think they’re really good. Well timed Darkness and your burst disappears. Blur dodging abilities etc
Demon proc Heal, really strong damage

At my rating idek what is going on, every game feels different with people doing random things (probably me included, from bad habits formed)

Well if you can be bothered and do end up doing it I’d be very curious. I’d understand if it never comes though haha

As an arms warrior I fully agree with this list.

I just took the top 10 players of each spec, where we can be somewhat sure that the mmr hotfix will have been enough, since mmr has shifted by 150 points so far afaik and each spec should have had at least 10 players pushing in the past few days?

The result isn’t that different from what I would have expected. Boomy became even more god tier, but Destro is still S tier. If the mmr hotfix wasn’t enough to compensate, Destro must have been the only spec pre-nerf that was able to reach 2500+ on average for the top 10 players. :laughing:

| Spec | Avg Rating | Win Ratio | Spec Strength | D2Median | D2Median in % |
| --- | --- | --- | --- | --- | --- |
| druid/balance | 2609.4 | 0.6272 | 1636.62 | 252.54 | 18.25 |
| warlock/demonology | 2527.4 | 0.62 | 1566.99 | 182.91 | 13.22 |
| warlock/destruction | 2521.8 | 0.6194 | 1562.0 | 177.92 | 12.85 |
| monk/mistweaver | 2462.7 | 0.6255 | 1540.42 | 156.34 | 11.3 |
| mage/fire | 2558.4 | 0.5946 | 1521.22 | 137.14 | 9.91 |
| priest/shadow | 2576.7 | 0.5885 | 1516.39 | 132.31 | 9.56 |
| druid/feral | 2584.1 | 0.5796 | 1497.74 | 113.66 | 8.21 |
| shaman/elemental | 2590.4 | 0.5593 | 1448.81 | 64.73 | 4.68 |
| rogue/subtlety | 2546.2 | 0.5631 | 1433.77 | 49.69 | 3.59 |
| shaman/enhancement | 2522.9 | 0.5678 | 1432.5 | 48.42 | 3.5 |
| hunter/marksmanship | 2505.1 | 0.5673 | 1421.14 | 37.06 | 2.68 |
| evoker/preservation | 2442.8 | 0.5766 | 1408.52 | 24.44 | 1.77 |
| warrior/arms | 2520.9 | 0.5548 | 1398.6 | 14.52 | 1.05 |
| priest/holy | 2404.3 | 0.5787 | 1391.37 | 7.29 | 0.53 |
| shaman/restoration | 2419.1 | 0.5739 | 1388.32 | 4.24 | 0.31 |
| monk/windwalker | 2546.6 | 0.5435 | 1384.08 | 0.0 | 0.0 |
| druid/restoration | 2450.6 | 0.5585 | 1368.66 | -15.42 | -1.11 |
| warrior/fury | 2489.8 | 0.5483 | 1365.16 | -18.92 | -1.37 |
| paladin/retribution | 2512.0 | 0.54 | 1356.48 | -27.6 | -1.99 |
| demonhunter/havoc | 2526.2 | 0.534 | 1348.99 | -35.09 | -2.54 |
| warlock/affliction | 2421.0 | 0.5504 | 1332.52 | -51.56 | -3.73 |
| hunter/survival | 2467.8 | 0.5352 | 1320.77 | -63.31 | -4.57 |
| mage/arcane | 2350.2 | 0.5617 | 1320.11 | -63.97 | -4.62 |
| hunter/beastmastery | 2438.4 | 0.5369 | 1309.18 | -74.9 | -5.41 |
| mage/frost | 2423.0 | 0.5398 | 1307.94 | -76.14 | -5.5 |
| priest/discipline | 2373.7 | 0.5492 | 1303.64 | -80.44 | -5.81 |
| evoker/devastation | 2373.1 | 0.5441 | 1291.2 | -92.88 | -6.71 |
| rogue/assassination | 2387.4 | 0.525 | 1253.38 | -130.7 | -9.44 |
| paladin/holy | 2294.4 | 0.5125 | 1175.88 | -208.2 | -15.04 |
| deathknight/frost | 2184.5 | 0.5319 | 1161.94 | -222.14 | -16.05 |
| deathknight/unholy | 2272.6 | 0.5088 | 1156.3 | -227.78 | -16.46 |

Also DH is still around the same place as before. It would now be low B tier, but that’s still not a huge difference. Interesting though how UDK became even worse and marksman is suddenly top B tier. If I use the same values as before, the tier list looks like this:

https://ibb.co/qyMD2s7

Thoughts?

I believe you don’t need ChatGPT to do something clever in this domain, or do you? But in a way this is the right approach. Be careful though, ChatGPT improvises a lot sometimes; it’s like a box of chocolates, you never know what you are going to find, but people love chocolate anyway.

What you have tried to do is a multi factor analysis, on the basis of which you could derive the power of the different specs. Then the deviations would surely be used to prove god knows what. Usually there is a bit more testing and modelling warranted before the results are published though.

Objectively, you should have made some of the parameters in your model more explicit. Here you just multiply things together and assume that their weights are equal to 1. This matters because you also have to test your parameters and find the proper weighting for them. Maybe some parameters are irrelevant, in which case they should be dropped from the analysis in favor of better ones.
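One way to make the weights explicit, as a rough sketch: treat them as exponents, so that both weights at 1 reproduce the plain product used in the tables. The exponent form, the function name and the example weight of 2 are assumptions for illustration, not something proposed in the thread; finding sensible weights would still need testing against real outcomes.

```python
def weighted_strength(avg_rating, avg_win_ratio, w_rating=1.0, w_win=1.0):
    """Spec strength with explicit weights (used as exponents).

    With w_rating == w_win == 1 this is the plain product
    avg_rating * avg_win_ratio from the original tables.
    A weight of 0 removes that factor from the score entirely.
    """
    return (avg_rating ** w_rating) * (avg_win_ratio ** w_win)


# druid/balance averages from the top-100 table.
print(round(weighted_strength(2417.95, 0.5638), 2))             # 1363.24
# Same spec with win ratio counted twice as heavily (arbitrary choice).
print(round(weighted_strength(2417.95, 0.5638, w_win=2.0), 2))
```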

How many times do these specs get killed first ? How often does that happen ?
Death is the best ever factor that you could find in survival based zero sum games like arenas. It always happens, it never lies and it always determines winners or losers.
Maybe you should look into this one…

Then if I may suggest some other areas of investigation :

  • correlation matrix of the different winrates/deathrates spec to spec
  • variance analysis with violin plots or box plots
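As a starting point for the variance idea, the numbers behind a box plot can be computed with Python's standard `statistics.quantiles`. This is only a sketch: the function name, the spec labels and the ratings are hypothetical, and a real analysis would feed in the per-player ratings or win ratios of each spec.

```python
from statistics import quantiles


def box_plot_summary(values):
    """Five-number summary plus IQR, i.e. what a box plot draws.

    Needs at least two values, because statistics.quantiles does.
    """
    q1, q2, q3 = quantiles(values, n=4)
    return {
        "min": min(values),
        "q1": q1,
        "median": q2,
        "q3": q3,
        "max": max(values),
        "iqr": q3 - q1,  # spread of the middle half of the players
    }


# Hypothetical ratings: same median, very different spread.
ratings = {
    "spec_a": [2100, 2200, 2300, 2400, 2500],
    "spec_b": [2280, 2290, 2300, 2310, 2320],
}
for spec, values in ratings.items():
    print(spec, box_plot_summary(values))
```

Two specs with the same median can have a very different spread, which is exactly what a single average hides.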

Ideally a spec that would be really S-tier-Meta wouldn’t only win big most of the time, it would simply outmatch most other specs in terms of match-ups. Remember, Pvp is a relative strength game, you only need to be smarter or stronger than the next person you are paired with. If a spec has more favorable match-ups against the other player/spec populations then it’s a winner.

Lastly, if you win in most scenarios, that creates a dominant strategy, but you can go even further. Imagine that you could do it with less volatile games and more predictable outcomes than in most other cases… this is another advantage: why would you play something risky when you can go for the safer equivalent with similar pay-offs? Volatility and risk are super important in any serious analysis of this.
It is basically meant to give you knowledge about the shape of your population, where do they win and how stretched is it.

There are other topics of course like the herd mentality of PvPers which creates systematically a power law distribution with only a couple of specs at the top.

Think the sample size is too small to conclude anything but that doesn’t mean to say there is nothing to highlight
Feral got slingshot to S tier from B
Multiple Alts :stuck_out_tongue:
