This dataset, as it currently stands, is question-generating, not question-answering.
Accurate.
When the data tells us something that doesn’t agree with our experience, it might be time to start questioning the reliability of those experiences.
How you choose to interpret it is up to you, but I see this as a developing tool for a community that historically has based card choices entirely on personal experience.
Less accurate.
The data is always useful. But when the data presents information that is clearly unusable toward a specific goal, we have to reconsider what the data might be useful for. As an example, if I want to use the data to tell me what the best cards are, and Wasteland finishes significantly higher than Strip Mine, then the data has issues. We can use the data to identify some interesting trends, but the significance of the data gets called into question.
The data provides correlation, not causation.
Importantly, with the sample size, card pool, and playgroup considerations, card/color/theater/archetype preferences of 3-4 strong players in a weaker field can skew the data in a significant way.
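That pilot-skill confound can be sketched with a toy simulation (the win rates, the 0.65/0.50 split, and "Card X" are all made up for illustration): a card that adds nothing still posts an inflated win rate when only the strong drafters pick it.

```python
import random

random.seed(0)

# Toy model: Card X does NOT change anyone's win probability. But in this
# pod, only the 3-4 strong players (assumed 65% match win rate) ever draft
# it; everyone else (assumed 50%) never does. The card inherits its pilots'
# record.
def simulate(matches=20_000):
    with_x = sum(random.random() < 0.65 for _ in range(matches))
    without_x = sum(random.random() < 0.50 for _ in range(matches))
    return with_x / matches, without_x / matches

wr_with, wr_without = simulate()
print(f"win rate of decks with Card X:    {wr_with:.2f}")
print(f"win rate of decks without Card X: {wr_without:.2f}")
```

Nothing about Card X drove that gap; the drafter preferences did.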
When I first tried tracking information for my cube, we were trying to find maindeck % stats and win % stats. We did this for a while, but found inconsistencies in the data that prevented it from being much more than one additional tool in a vast toolbox. And ultimately, the information the players provided about their experiences was a more important part of card evaluation than the data we were mining. So we stopped.
The card that provides amazing value in a losing effort gets ignored. The card that provides poor (or no) value in a winning deck wielded by a skilled player gets credit for winning a draft. Without feedback from players regarding card performance, the data in and of itself doesn't carry much weight at all.
I would pick up the winning final 40 and pan through it, and ask the winner for specific feedback on cards. Like, "wow, I didn't know you were running New Card X in here, and you won the draft with it! How did it perform?!" ...And the answer was "I don't know--I never drew it". That was the end of us mining win % data, because it simply didn't provide reliable, card-specific information.
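That failure mode is easy to show with a hypothetical draft log (the card names and records here are invented): a naive deck-level win rate hands full credit to a card that was never actually drawn, while a drawn-games rate admits it has no data at all.

```python
# Each record: did the deck win, what was registered, what was actually drawn.
matches = [
    {"won": True,  "deck": {"New Card X", "Bolt", "Counterspell"}, "drawn": {"Bolt", "Counterspell"}},
    {"won": True,  "deck": {"New Card X", "Bolt"},                 "drawn": {"Bolt"}},
    {"won": False, "deck": {"Old Card Y", "Bolt"},                 "drawn": {"Old Card Y", "Bolt"}},
]

def deck_win_rate(card):
    # Credit the card for every result of every deck that registered it.
    games = [m for m in matches if card in m["deck"]]
    return sum(m["won"] for m in games) / len(games)

def drawn_win_rate(card):
    # Only count matches where the card was actually seen.
    games = [m for m in matches if card in m["drawn"]]
    return sum(m["won"] for m in games) / len(games) if games else None

print(deck_win_rate("New Card X"))   # 1.0 -- full credit, never drawn
print(drawn_win_rate("New Card X"))  # None -- no matches where it mattered
```

Even the drawn-games version only recovers correlation; it still can't separate the card from the deck or the pilot, which is why the player feedback mattered more.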
There is information that can be mined in this data: popularity trends, drafting tendencies, player performance, etc. But I don't think that card quality determinations can be made from it. Correlational support is great info to have, but to say that "when the data tells us something that doesn't agree with our experience, it might be time to start questioning the reliability of those experiences" is blatantly false. Because if a player tells me card X was great in a losing effort, and another player tells me card Y was mediocre in a ridiculously good deck, that's data I can use.
Thank you guys so much for taking the time to extract all this data. It's fascinating stuff, and there will be good info to extract from it.