Tier List based on Shuffle ladder

Lillydot-mograine · July 2, 2023, 7:57pm

Hey there.

Since I had read several threads today about nerfs/buffs for the same classes and wanted to test whether ChatGPT is suitable as a programming aid, I thought I could write a program with AI support that evaluates the shuffle ladder and ranks specs by strength.

For this, the program analyzed the top 100 players of each spec (except tanks and outlaw rogue). To determine the strength of a spec, it multiplied the spec’s average rating by its average win ratio.
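
A minimal sketch of that calculation, for anyone who wants to check the numbers. This is not the original program (which is not posted in this thread); the function names are made up for illustration, and only three rows of the table are used. With just these three rows the median happens to be windwalker, the same as in the full table.

```python
from statistics import median

# (spec, average rating, average win ratio) for three rows of the table in this post
rows = [
    ("druid/balance", 2417.95, 0.5638),
    ("monk/windwalker", 2261.93, 0.5254),
    ("mage/arcane", 1905.22, 0.5187),
]

def spec_strength(avg_rating, win_ratio):
    """Strength score as described in the post: average rating times average win ratio."""
    return avg_rating * win_ratio

def rank_specs(rows):
    """Return (spec, strength, distance to median, distance in %) sorted by strength."""
    scored = [(spec, spec_strength(rating, win_ratio)) for spec, rating, win_ratio in rows]
    mid = median([strength for _, strength in scored])
    table = [(spec, s, s - mid, 100 * (s - mid) / mid) for spec, s in scored]
    return sorted(table, key=lambda row: row[1], reverse=True)

# Print in the same column order as the table: Spec, Spec Strength, D2Median, in %
for spec, strength, dist, pct in rank_specs(rows):
    print(f"{spec}\t{strength:.2f}\t{dist:.2f}\t{pct:.2f}")
```

For druid/balance this gives a strength of about 1363.24, which is 174.82 points or 14.71% above the median, matching the table.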

Here is the result, with the last two columns showing how far each spec’s strength is from the median, in points (D2Median) and in percent:

Spec	Avg Rating	Win Ratio	Spec Strength	D2Median	in %
druid/balance	2417.95	0.5638	1363.24	174.82	14.71
warlock/destruction	2331.3	0.5762	1343.3	154.88	13.03
monk/mistweaver	2303.31	0.5682	1308.74	120.32	10.12
shaman/elemental	2331.16	0.5471	1275.38	86.96	7.32
warlock/demonology	2305.47	0.5485	1264.55	76.13	6.41
priest/shadow	2363.35	0.5337	1261.32	72.9	6.13
rogue/subtlety	2272.57	0.548	1245.37	56.95	4.79
druid/restoration	2273.99	0.5467	1243.19	54.77	4.61
paladin/retribution	2343.4	0.529	1239.66	51.24	4.31
shaman/enhancement	2289.61	0.5412	1239.14	50.72	4.27
mage/fire	2281.27	0.5406	1233.25	44.83	3.77
priest/holy	2247.32	0.5446	1223.89	35.47	2.98
warrior/fury	2226.68	0.5474	1218.88	30.46	2.56
druid/feral	2272.37	0.5304	1205.27	16.85	1.42
warrior/arms	2266.29	0.5264	1192.98	4.56	0.38
monk/windwalker	2261.93	0.5254	1188.42	0.0	0.0
shaman/restoration	2208.09	0.5354	1182.21	-6.21	-0.52
hunter/survival	2227.94	0.5268	1173.68	-14.74	-1.24
evoker/preservation	2185.11	0.5357	1170.56	-17.86	-1.5
priest/discipline	2217.18	0.5247	1163.35	-25.07	-2.11
warlock/affliction	2183.84	0.5304	1158.31	-30.11	-2.53
demonhunter/havoc	2203.34	0.5191	1143.75	-44.67	-3.76
mage/frost	2152.52	0.5216	1122.75	-65.67	-5.53
hunter/marksmanship	2142.08	0.5225	1119.24	-69.18	-5.82
evoker/devastation	2099.72	0.5309	1114.74	-73.68	-6.2
hunter/beastmastery	2084.12	0.5268	1097.91	-90.51	-7.62
deathknight/unholy	2076.91	0.5135	1066.49	-121.93	-10.26
paladin/holy	2058.76	0.518	1066.44	-121.98	-10.26
rogue/assassination	2043.24	0.5079	1037.76	-150.66	-12.68
deathknight/frost	1933.18	0.5193	1003.9	-184.52	-15.53
mage/arcane	1905.22	0.5187	988.24	-200.18	-16.84

According to the data, WW Monk is at the median, meaning that 50% of the specs are performing better on the ladder while 50% are lagging behind.

I think that speaks for itself, but for better visualization, I’ve created a tier list based on this data which you can find here:

https://ibb.co/4NYPTYX

I’m actually surprised that the data almost perfectly matches my expectations. I would have thought that sub rogue and hunter would be higher, but I suppose the reason is the random groups and how difficult it is to get good setups that way. Also, I expected DH to be higher, but whatever.

The rest, however, was exactly what I expected and it clearly shows the current wizard meta and how powerful control + range abilities + magic damage (ret/enha) really are! To be fair though, we need a god tier for boomy since S tier isn’t even close enough to show its strength!!

What do you think about this kind of analysis? Do you find it helpful to see a more statistical and therefore less biased tier list? Let me know!

Siwel-ravencrest · July 2, 2023, 8:06pm

Arms, Feral and Holy Priest into the middle of A tier and it’s pretty spot on I think.

Lillydot-mograine · July 2, 2023, 8:11pm

But then it is biased again. The tier list is simply based on 3100 players, meaning the top 100 players of each spec and how they are doing in shuffle.

Siwel-ravencrest · July 2, 2023, 8:15pm

Which is only fair if you have them playing the same amount of games into the exact same matchups. My comment is based on fighting all specs up to and at 2.5+ mmr. Those specs are better than B tier.

Slinks-silvermoon · July 2, 2023, 8:48pm

Thanks for posting that
Loads of really interesting specs in the plotted table
cries in DH

Broxis-blackrock · July 2, 2023, 8:49pm

What your analysis does not take into account is that the data is distorted: there were several buffs and nerfs to many specs, so specs that were formerly S tier still have high representation even though they are not S tier anymore (destro, for example).

The timeframe of the data is important when analysing the data.

DH is pretty solid and not C tier at all haha. Prevoker and Holy Priest are A tier too in my opinion. Especially holy after recent buffs.

In general it’s a decent list though

Flylikemike-stormscale · July 2, 2023, 8:55pm

Can already bet there will be statistics gurus arguing your logic

Thanks for posting, very interesting and imo truthful, and it correlates with what I’ve been experiencing in the game as well

Slinks-silvermoon · July 2, 2023, 8:56pm

Data says no
haha
Not that black and white but DH defensives are really questionable

Siwel-ravencrest · July 2, 2023, 8:59pm

It’s comfortably a B tier solo shuffle spec.

Slinks-silvermoon · July 2, 2023, 8:59pm

I have no idea what tier it is
I’m terrible at this game
Surprised I still have a full head of hair from Solo fiesta

Small Edit - I’d say B is probably around correct for RSS

Lillydot-mograine · July 2, 2023, 9:03pm

I thought about that too, but then I thought that the mmr hotfix from 1.5 weeks ago would more than account for that. But depending on how active the players of each spec have been since then, the data could be insufficient for some specs: it represents only the current snapshot of the ladder, and hotfixes need some time until they are reflected in it.

However, there were no patch notes that move a spec into a completely different tier, were there?

It might be that there is a huge gap between very good DHs and just solid ones, but I took the top 100 for a reason and not the top 20, for example, where the tier list comes out differently for some specs like Arms and Feral.

Maybe it’s better to use a CR limit, for example every player of a spec from duelist rating upwards? But then how do I account for the different numbers of players? Would it be fair to compare 350 boomkins with 100 DHs, for example? Any suggestions to improve it?

To be fair, havoc is only about 3.8% behind the median. Just a tiny bit more win ratio or a slightly higher average rating and it would have been in low B tier.

It’s not too bad, but I wanted to keep the bias out and just place it the same way I placed every other spec.

Slinks-silvermoon · July 2, 2023, 9:05pm

Depending how long it takes to feed the data
Do 1800+
2100+
2400+
Would this work ?

ahh I was just memeing lol
I genuinely expected to see it higher ngl

Lillydot-mograine · July 2, 2023, 9:09pm

It would definitely work, but it will take some time to do the coding since the current one only takes the first page of each spec (top 100). Based on ratings I need to modify it quite a bit and I don’t have enough time during the week, so maybe next weekend if I’m in the mood.
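
A rough sketch of how such rating brackets could be evaluated once more than the first ladder page is available. Everything here is an assumption for illustration: the function name, the data layout, and the player entries, which are placeholders and not real ladder data. Using per-spec averages and reporting the player count next to the score is one simple way to compare specs with very different numbers of players in a bracket.

```python
from collections import defaultdict
from statistics import mean

# Placeholder ladder entries (spec, rating, win ratio); NOT real ladder data.
players = [
    ("druid/balance", 2450, 0.58),
    ("druid/balance", 2150, 0.54),
    ("druid/balance", 1850, 0.51),
    ("demonhunter/havoc", 2420, 0.53),
    ("demonhunter/havoc", 1900, 0.52),
]

def strength_by_bracket(players, thresholds=(1800, 2100, 2400)):
    """For each rating floor, return {spec: (player count, strength score)}."""
    result = {}
    for floor in thresholds:
        # Collect every player of each spec at or above the rating floor
        by_spec = defaultdict(list)
        for spec, rating, win_ratio in players:
            if rating >= floor:
                by_spec[spec].append((rating, win_ratio))
        # Same score as in the first post: average rating times average win ratio
        result[floor] = {
            spec: (
                len(entries),
                mean([r for r, _ in entries]) * mean([w for _, w in entries]),
            )
            for spec, entries in by_spec.items()
        }
    return result

for floor, specs in strength_by_bracket(players).items():
    for spec, (count, strength) in specs.items():
        print(f"{floor}+\t{spec}\t{count} players\t{strength:.2f}")
```

A spec with only a handful of players in a bracket gets a noisy score, so the count column is worth reading together with the strength.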

I see.

Broxis-blackrock · July 2, 2023, 9:12pm

Above 1800 it’s pretty solid, but it falls off strongly above 2200, for example. Don’t really know why tbh. When I play with DHs they feel insanely strong haha.

But yes, the data shows that it’s not that good at top ratings in shuffle, so probably not A tier. Definitely not C tier though.

destro for example is way less represented compared to a few weeks ago.
Holy Priests are way more dominant now. Both definitely moved a tier.

Not sure, but probably a time limit, like how the ladder looked over the last 2 weeks or something, but that’s not possible with the data we have, I think.

Lillydot-mograine · July 2, 2023, 9:14pm

Unfortunately it is not. We can take the data again in 2 weeks and see how it moved then but I don’t have any data from 2 weeks ago, just the current ladder.

I still think the mmr hotfix would have done the work, as the average rating for the other specs should have become much higher compared to when destro got nerfed.

Slinks-silvermoon · July 2, 2023, 9:18pm

Yeah sounds quite reasonable. Above 2200 players probably know exactly how to punish DH so a small mistake = hold cheeks

When things go right I can see why people think they’re really good. Well timed Darkness and your burst disappears. Blur dodging abilities etc
Demon proc Heal, really strong damage

At my rating idek what is going on, every game feels different with people doing random things (probably me included, from bad habits I’ve formed)

Well if you can be bothered and do end up doing it I’d be very curious. I’d understand if it never comes though haha

Leroxius-ragnaros · July 2, 2023, 9:43pm

As an arms warrior I fully agree with this list.

Lillydot-mograine · July 2, 2023, 9:53pm

I just took the top 10 players of each spec, where we can be somewhat sure that the mmr hotfix will have been enough, since mmr has shifted by 150 points so far afaik and each spec should have had at least 10 players pushing in the past few days?

The result isn’t that different from what I would have expected. Boomy became even more god tier, but Destro is still S tier. If the mmr hotfix wasn’t enough to compensate, it must have been the only spec pre nerf that was able to reach 2500+ on average for the top 10 players.

Spec	Avg Rating	Win Ratio	Spec Strength	D2Median	in %
druid/balance	2609.4	0.6272	1636.62	252.54	18.25
warlock/demonology	2527.4	0.62	1566.99	182.91	13.22
warlock/destruction	2521.8	0.6194	1562.0	177.92	12.85
monk/mistweaver	2462.7	0.6255	1540.42	156.34	11.3
mage/fire	2558.4	0.5946	1521.22	137.14	9.91
priest/shadow	2576.7	0.5885	1516.39	132.31	9.56
druid/feral	2584.1	0.5796	1497.74	113.66	8.21
shaman/elemental	2590.4	0.5593	1448.81	64.73	4.68
rogue/subtlety	2546.2	0.5631	1433.77	49.69	3.59
shaman/enhancement	2522.9	0.5678	1432.5	48.42	3.5
hunter/marksmanship	2505.1	0.5673	1421.14	37.06	2.68
evoker/preservation	2442.8	0.5766	1408.52	24.44	1.77
warrior/arms	2520.9	0.5548	1398.6	14.52	1.05
priest/holy	2404.3	0.5787	1391.37	7.29	0.53
shaman/restoration	2419.1	0.5739	1388.32	4.24	0.31
monk/windwalker	2546.6	0.5435	1384.08	0.0	0.0
druid/restoration	2450.6	0.5585	1368.66	-15.42	-1.11
warrior/fury	2489.8	0.5483	1365.16	-18.92	-1.37
paladin/retribution	2512.0	0.54	1356.48	-27.6	-1.99
demonhunter/havoc	2526.2	0.534	1348.99	-35.09	-2.54
warlock/affliction	2421.0	0.5504	1332.52	-51.56	-3.73
hunter/survival	2467.8	0.5352	1320.77	-63.31	-4.57
mage/arcane	2350.2	0.5617	1320.11	-63.97	-4.62
hunter/beastmastery	2438.4	0.5369	1309.18	-74.9	-5.41
mage/frost	2423.0	0.5398	1307.94	-76.14	-5.5
priest/discipline	2373.7	0.5492	1303.64	-80.44	-5.81
evoker/devastation	2373.1	0.5441	1291.2	-92.88	-6.71
rogue/assassination	2387.4	0.525	1253.38	-130.7	-9.44
paladin/holy	2294.4	0.5125	1175.88	-208.2	-15.04
deathknight/frost	2184.5	0.5319	1161.94	-222.14	-16.05
deathknight/unholy	2272.6	0.5088	1156.3	-227.78	-16.46

Also, DH is still around the same place as before. It would now be low B tier, but that’s still not a huge difference. Interesting though how UDK became even worse and marksman is suddenly top B tier. If I take the same values as before, the tier list would look like this:

https://ibb.co/qyMD2s7

Thoughts?

Elasha-archimonde · July 2, 2023, 9:57pm

I believe you don’t need ChatGPT to do something clever in this domain, or do you ? But in a way this is the right approach. Be careful though, ChatGPT improvises a lot sometimes, it’s like a box of chocolates, you never know what you are going to find, but people love chocolate anyways.

What you have tried to do is a multi factor analysis, on the basis of which you could derive the power of the different specs. Then the deviations would surely be used to prove god knows what. Usually there is a bit more testing and modelling warranted before the results are published though.

Objectively, you should have made some of your parameters more obvious in your model. Like here you just multiply things around and assume that their weights would be equal to 1. The reason it is important is because you also have to test your parameters and find the proper weighting for them. Maybe some parameters are irrelevant, hence they should be removed from the analysis in lieu of better ones.
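
To make the point about weights concrete, here is a minimal sketch with the weights written out as explicit exponents. The function name and the example weight of 0.5 are assumptions chosen for illustration, not fitted or tested values; the ratings and win ratios are the priest/shadow and warlock/demonology rows from the top 100 table.

```python
def weighted_strength(avg_rating, win_ratio, rating_weight=1.0, win_weight=1.0):
    """Composite score with explicit weights as exponents.

    With both weights at 1.0 this is the plain product used in the first post.
    """
    return (avg_rating ** rating_weight) * (win_ratio ** win_weight)

# Two rows from the top 100 table: (average rating, average win ratio)
shadow = (2363.35, 0.5337)
demonology = (2305.47, 0.5485)

# Equal weights: the implicit model of the original analysis
equal_shadow = weighted_strength(*shadow)
equal_demo = weighted_strength(*demonology)

# Hypothetical alternative: win ratio counts only half as much
alt_shadow = weighted_strength(*shadow, win_weight=0.5)
alt_demo = weighted_strength(*demonology, win_weight=0.5)

print("equal weights: demonology ahead ->", equal_demo > equal_shadow)
print("win_weight=0.5: shadow ahead ->", alt_shadow > alt_demo)
```

With equal weights demonology sits just above shadow, and with the reduced win ratio weight the two swap places, so the ranking of close specs depends on a choice that the plain product hides.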

How many times do these specs get killed first ? How often does that happen ?
Death is the best ever factor that you could find in survival based zero sum games like arenas. It always happens, it never lies and it always determines winners or losers.
Maybe you should look into this one…

Then if I may suggest some other areas of investigation :

correlation matrix of the different winrates/deathrates spec to spec
variance analysis with violin plots or box plots
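
As a starting point for the variance analysis, a box plot boils down to a five-number summary per spec. The sketch below computes one with the standard library; the function name is made up, and the ratings are evenly spaced placeholders, not real ladder data.

```python
from statistics import quantiles

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), the numbers a box plot draws."""
    ordered = sorted(values)
    # quantiles(..., n=4) returns the three quartile cut points
    q1, q2, q3 = quantiles(ordered, n=4)
    return ordered[0], q1, q2, q3, ordered[-1]

# Placeholder ratings for one made-up spec; NOT real ladder data.
ratings = [2100, 2200, 2300, 2400, 2500, 2600, 2700]

low, q1, mid, q3, high = five_number_summary(ratings)
print(f"min={low} Q1={q1} median={mid} Q3={q3} max={high} IQR={q3 - q1}")
```

Comparing the interquartile range (Q3 minus Q1) across specs shows how stretched each spec’s population is, which the averages in the tables cannot show.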

Ideally a spec that would be really S-tier-Meta wouldn’t only win big most of the time, it would simply outmatch most other specs in terms of match-ups. Remember, Pvp is a relative strength game, you only need to be smarter or stronger than the next person you are paired with. If a spec has more favorable match-ups against the other player/spec populations then it’s a winner.

Lastly, if you win against most scenarios it creates a dominant strategy, but you can go even beyond. Imagine that you could do it but with less volatile games and more predictable outcomes than in most other cases… this is another advantage : why would you play something risky when you can go for the safer equivalent with similar pay-offs ? Volatility and risk are super important in any serious analysis on this.
It is basically meant to give you knowledge about the shape of your population, where do they win and how stretched is it.
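
A small sketch of that idea: two specs with nearly the same average win ratio but very different spread. The spec names and the per-lobby win ratios are invented for illustration, not real data, and the sample standard deviation is just one possible measure of volatility.

```python
from statistics import mean, stdev

# Placeholder per-lobby win ratios for two made-up specs; NOT real data.
results = {
    "spec_a": [0.50, 0.67, 0.50, 0.67, 0.50, 0.67],
    "spec_b": [1.00, 0.17, 0.83, 0.33, 1.00, 0.17],
}

def risk_profile(win_ratios):
    """Return (average win ratio, sample standard deviation)."""
    return mean(win_ratios), stdev(win_ratios)

for spec, ratios in results.items():
    avg, spread = risk_profile(ratios)
    print(f"{spec}\tmean={avg:.3f}\tstdev={spread:.3f}")
```

Both specs average roughly the same, but spec_b swings far more from lobby to lobby, so spec_a would be the safer equivalent in the sense described above.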

There are other topics of course like the herd mentality of PvPers which creates systematically a power law distribution with only a couple of specs at the top.

Slinks-silvermoon · July 2, 2023, 10:08pm

Think the sample size is too small to conclude anything but that doesn’t mean to say there is nothing to highlight
Feral got slingshot to S tier from B
Multiple Alts