Marty Kasprzyk Posted April 11, 2014

I'm with Marty in principle, but I think the math is off. 0.5^5 is the probability of guessing 5 in a row correctly, which is much lower than guessing 5 out of 6 correctly. I peg that at 6/(2^6), or ~9.4%, for a single individual. With 10 participants, if I've done the math right, there is a ~64% probability that someone will get 5 out of 6 right. This only shows that more tests would be needed to get more confidence in Player 1's ability to tell old from new, and that chance alone is a good possibility for the result.

Damned if I know. It's been 46 years since I had a course in probability. If I forget 5% every year, it means I know pretty close to nothing now.
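The figures above are easy to check. A minimal sketch in Python, assuming pure guessing (each answer a fair coin, p = 0.5) and ten independent players; the helper name binom_pmf is mine, purely for illustration. The ~64% quoted above roughly matches the "exactly 5 of 6" calculation (~63%); counting "at least 5 of 6" gives ~69%.

```python
from math import comb

def binom_pmf(k, n, p=0.5):
    """Probability of exactly k correct guesses out of n under pure chance."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

p_exactly_5 = binom_pmf(5, 6)                     # 6/64 ~ 0.094
p_at_least_5 = binom_pmf(5, 6) + binom_pmf(6, 6)  # 7/64 ~ 0.109

# Chance that at least one of 10 guessing players does this well:
print(1 - (1 - p_exactly_5) ** 10)   # ~0.63, someone gets exactly 5 right
print(1 - (1 - p_at_least_5) ** 10)  # ~0.69, someone gets 5 or more right
```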
Torbjörn Zethelius Posted April 11, 2014

All this probability math seems to indicate that they were flipping coins. Well, were they? In my mind there's a big chance that somebody actually knew what they were doing.
dan_s Posted April 11, 2014

There is NO evidence that any player scored any better than chance.

The reason there is no evidence is not your probabilities; it is because they STOPPED TESTING.
dan_s Posted April 11, 2014

All this probability math seems to indicate that they were flipping coins. Well, were they?

We will never know. Because they STOPPED TESTING. That is why STATISTICS here is meaningless. It has no business here. Because they STOPPED TESTING.
dan_s Posted April 11, 2014

Nice use of math. 0.09375 is correct, although for the wrong problem. See my previous posts. It's easier to calculate the probability of getting one wrong (6/64) plus the probability of getting zero wrong (1/64).

Your previous post SUPPOSES that THERE IS NO REASON TO BELIEVE THE ABILITY TO SORT NEW VS OLD IS DIFFERENT BETWEEN PLAYERS. There are many blunders in this study.
legenyes Posted April 11, 2014

If you have not met anyone who can ace this test then you're not hangin' with the right people. It does also make me wonder about the choice of musicians in the study. Personally I can't pick old vs new, but I've met people who seem to do it quite consistently. I don't think it's a common talent, but neither is playing soloist-quality violin. Oded

The forum ate my post, but I'll try again. For what it's worth, I have ears too. And I've done quite well in those "pick the Strad" demos, but I have no confidence because I had to guess at the criteria. Overconfidence is also a well-known human trait, even among experts. It's possible that there is a systematic difference between great old violins and great new violins, and that people who get to hear them regularly up close can tell the difference. All the evidence I've seen, however, is that when the blindfold goes on, the ability to discriminate disappears. Unfortunately, the amount of evidence is rather limited.
Carl Stross Posted April 11, 2014

We will never know. Because they STOPPED TESTING. That is why STATISTICS here is meaningless. It has no business here. Because they STOPPED TESTING.

Agreed. But can you PROVE it??
legenyes Posted April 11, 2014

All this probability math seems to indicate that they were flipping coins. Well, were they?...

No, the probability calculations don't say that none of the players could tell the difference. They just say that we can't tell from the evidence. The paper was quite clear on this point, if you read it. Remember that Carl Stross asserts that the evidence shows that two of the players could tell the difference. The math simply says that no, those results could easily be due to chance, and that there is really no significant evidence on that question from the study. If I gave any other impression, I apologize for the confusion. Contrary to what you think, I make no assumptions about this. It's possible that more data would show that some of the players could discriminate. However, there is a problem. My qualitative impression of the data is that the players ranked the violins fairly consistently, and thus could hear the differences between them. If that's true, then I think you need not just more tests with the same players, but many more violins.
dan_s Posted April 11, 2014

The math simply says that no, those results could easily be due to chance

I think you're starting to see the light: "the maths" says NOTHING. It keeps quiet.
Don Noon Posted April 12, 2014

Damned if I know. It's been 46 years since I had a course in probability. If I forget 5% every year, it means I know pretty close to nothing now.

Actually, you know 0.95^46 ≈ 9.4% of what you knew 46 years ago.
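For the record, Don's figure checks out; the point is that retention compounds multiplicatively rather than dropping a flat 5 points a year. A one-line check in Python:

```python
print(0.95 ** 46)  # ~0.094: about 9.4% of the original knowledge remains
```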
stefan1 Posted April 12, 2014

No, the probability calculations don't say that none of the players could tell the difference. They just say that we can't tell from the evidence. The paper was quite clear on this point, if you read it. Remember that Carl Stross asserts that the evidence shows that two of the players could tell the difference. The math simply says that no, those results could easily be due to chance, and that there is really no significant evidence on that question from the study. If I gave any other impression, I apologize for the confusion. Contrary to what you think, I make no assumptions about this. It's possible that more data would show that some of the players could discriminate. However, there is a problem. My qualitative impression of the data is that the players ranked the violins fairly consistently, and thus could hear the differences between them. If that's true, then I think you need not just more tests with the same players, but many more violins.

You could absolutely, without question, hear differences between the violins - as a player and as an audience member (which I was 99 percent of the time) - however, old/new - no.
Carl Stross Posted April 12, 2014

No, the probability calculations don't say that none of the players could tell the difference. They just say that we can't tell from the evidence. The paper was quite clear on this point, if you read it. Remember that Carl Stross asserts that the evidence shows that two of the players could tell the difference. The math simply says that no, those results could easily be due to chance, and that there is really no significant evidence on that question from the study. If I gave any other impression, I apologize for the confusion.

No. What the maths says is that the players AS A GROUP were unable to beat chance in identifying New vs Old. This should clarify it: "Considering all guesses about all instruments, 33 were wrong, 31 right, and 5 indeterminate. These guesses were rather evenly divided between old and new violins (36 and 33 respectively - see Table 2), so the data rather clearly demonstrate the inability of the players to reliably guess an instrument's age, whether the instrument is in fact new or old." But I am not interested in that, because if a SINGLE player can guess right (even better than average) then the conclusion of the study fails: "Soloists failed to distinguish new from old at better than chance levels". But it's "soloists" and not "soloists as a group". That's why I am now entitled to look at a single player (player 1) and consider that he was tested 6 times and got 5 answers right.
Don Noon Posted April 12, 2014

That's why I am now entitled to look at a single player (player 1) and consider that he was tested 6 times and got 5 answers right.

According to your logic, if you had 10 people flip coins 6 times, and one of them got heads 5 times, you would then proclaim him skilled at obtaining heads? THERE ISN'T ENOUGH INFORMATION TO SAY MUCH OF ANYTHING.
Carl Stross Posted April 12, 2014

According to your logic, if you had 10 people flip coins 6 times, and one of them got heads 5 times, you would then proclaim him skilled at obtaining heads? THERE ISN'T ENOUGH INFORMATION TO SAY MUCH OF ANYTHING.

But the first player is not flipping coins - he's using his expertise to identify a difference. This is not a coin-flipping study.
Janito Posted April 12, 2014

Sometimes it is most fruitful to spend time analysing the extremes - those who did very well and those who did very badly. The ones in the middle just add noise. And let's not forget repeatability that defies regression to the mean.
Jim Bress Posted April 12, 2014

The 'catch' hinges on the idea that anonymity is required for objectivity. That's why the 'gold standard' for scientific testing is a 'double-blind', placebo-controlled, repeatable trial. That is, neither the tester nor the tested knows the identity of the object being tested. Once the listener can identify what they are hearing, then whatever (unconscious) prejudice they might have would determine their choice. Speaking of scientific protocol, what is the control in this experiment? (sorry, haven't gotten around to reading it yet) Oded

Sadly, no control and no statistics other than descriptive statistics. I posted a pdf of the article in the other thread. It is really a shame that with the considerable time, effort, and money (presumably) put into this project, the authors did not design the methods in a manner that would allow for statistical analyses. With descriptive statistics you really can't separate causal results from variance and random chance. I haven't quite finished the article yet because my time is short and I need sleep. I'll finish it sometime tomorrow. Good night, Jim
JohnCockburn Posted April 12, 2014

According to your logic, if you had 10 people flip coins 6 times, and one of them got heads 5 times, you would then proclaim him skilled at obtaining heads? THERE ISN'T ENOUGH INFORMATION TO SAY MUCH OF ANYTHING.

I think this argument would make perfect sense if the study were equivalent to 10 people each tossing an unbiased coin 6 times. But surely a better analogy to the study would be inviting 10 people to each bring their own coin, each of which may or may not be biased. The bias of course represents the degree of skill the player has in choosing new vs old. The object of the study is then to determine whether any of the coins are biased (can any of these chumps really tell new from old?). If you are sure that none of the coins are biased, then 5 out of 6 heads by one tosser is of no interest. But if the object of the study is to investigate whether or not any of the coins are biased, surely you need to carry out further tests on Mr 5/6's quarter before you reach a negative conclusion? As it stands, though, you're correct. There isn't enough information to say much of anything.
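JohnCockburn's unbiased-coin baseline is easy to simulate. A minimal Monte Carlo sketch in Python, assuming ten independent players guessing at random (the function name study_has_a_star is mine, for illustration): it estimates how often an all-chance study still produces at least one apparent 5-of-6 expert.

```python
import random

def study_has_a_star(players=10, guesses=6, threshold=5):
    """One simulated all-chance study: does any player reach the threshold?"""
    return any(
        sum(random.random() < 0.5 for _ in range(guesses)) >= threshold
        for _ in range(players)
    )

runs = 100_000
rate = sum(study_has_a_star() for _ in range(runs)) / runs
print(rate)  # ~0.69: most pure-guessing studies contain a 5-of-6 "star"
```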
legenyes Posted April 12, 2014

I think you're starting to see the light: "the maths" says NOTHING. It keeps quiet.

Starting to see the light? Here is my first statement: "Fritz and others are correct. There is NO evidence that any player scored any better than chance." I think that's clear enough. I realize that English is not your first language. Did I answer your questions? --the questions in capital letters?
Violadamore Posted April 12, 2014

There is NO evidence that any player scored any better than chance.

Since we're discussing probability and the frontiers of Science............. [shuffles and cuts a deck of Rhine cards modified to have photos of violins on them, and grins very evilly indeed]
Don Noon Posted April 12, 2014

But the first player is not flipping coins - he's using his expertise to identify a difference.

That could be the case. But you cannot KNOW that his expertise caused the result unless you are God... or unless more tests are done.
lFred Posted April 12, 2014

The 2014 paper aimed to confirm and further investigate conclusions made by the authors in 2010. It claims to address some of the shortcomings of the initial experiment. The evolution of preferences from rehearsal room to concert hall was the first thing they wanted to study. The authors conclude from their findings "...that meaningful testing about general preferences is possible outside a concert hall .../.. There is certainly no evidence here to support the belief that Old Italian violins come into their own in concert halls, while new ones fall behind."

Be that as it may, I chose to have a look at the actual data.

Initial impression: the first thing that struck me is that the numbers do not add up in Figure 1. Session 1 gives us 99 choices, but 12 violins and 10 players means 120 choices. The explanation is nowhere to be found in the paper. Only when I took a look at the raw data in Table S1 could I figure out that there were some "intermediate" violins. Again, no explanation of what those are or how they are taken into account. You have to guess that some instruments were not discarded yet not in the top 4, and that the authors chose not to discuss the data from those violins, as they were given a transparent 0 value... That's 21 out of 120 data inputs, i.e. 17.5% of the values made transparent.

Global data gives us:

Session 1 | Session 2
Rejections, total: 59 | 61
Rejections, new: 26 | 27
Rejections, Italian 18th c.: 33 | 34
Positives, total: 40 | 40
Positives, new: 24 | 24
Positives, Italian 18th c.: 16 | 16

Those numbers are consistent with the authors' conclusions: a small room seems good enough to test instruments, and there is no difference between new instruments and 18th-century Italians.

Let's take another look at Table 1, as far as the preferred instrument (the one the player would keep to play) is concerned:

For 18th-century Italians, there is a 400% increase in choices between session 1 and session 2, and a 200% increase in the number of instruments chosen.
For new instruments, there is a 33% decrease in choices between session 1 and session 2, and a 40% decrease in the number of instruments chosen.
The number of discarded instruments is stable.

Let's take a look at the "transparent" data. To do so, I took the raw data in Table S1 and, for each instrument, each time it gained a rank between sessions I assigned it +1; each time it lost a rank I assigned it -1; if it stayed in position it got 0. For example, if violin O1 was transparent and not taken into account in session 1 with player 9, it gained +4 in session 2. This also allows tracking in a bit more detail what happens between sessions, rather than just the global value, even with the rejected instruments. (A sketch of this bookkeeping follows below.)

Those numbers tell us that:

For 18th-century Italians, the maximum amplitude in choice between session 1 and session 2 is 10 points (O1 is +6; O3 and O6 are -4).
For new instruments, the maximum amplitude is 19 points (N5 is +11; N7 is -8).
The global gain across all 18th-century Italians from S1 to S2 is +1.
The global gain across all new instruments from S1 to S2 is +10.

What do we make of those numbers? If I had only looked at the Table 1 data (the ones analysed by the authors), I would have concluded that there is evidence that you cannot properly test a violin in a hotel room, and that large halls were indeed more beneficial to 18th-century Italian instruments than to modern instruments.
In other words, the exact opposite of what the authors concluded.

What do the transparent data tell us that the global data did not? They concur that the venue changes the way a player likes or dislikes an instrument. They also seem to indicate that although the old instruments appear to gain an advantage in a large hall, the change of venue modifies to an even greater extent the way the players judge an instrument. (That was to be expected, as session 2 allowed new experiences for the player, not just a new venue.) In my opinion, one could consider this a design flaw, as there is no way to judge whether the changes in scores and criteria are due to the change of venue or the change in listening perspective.

Raw data for Part 2 of the experiment (evaluation by specific criteria) are not available; moreover, the protocol has, in my opinion, major flaws. I therefore discarded it, as I saw no point in discussing it.
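For anyone who wants to replicate the +1/-1 rank-delta bookkeeping lFred describes, here is a minimal sketch. The ranks below are invented purely for illustration; the real per-player rankings are in the paper's Table S1.

```python
# Hypothetical ranks for one player (1 = most preferred) -- NOT the real Table S1 data.
session1 = {"O1": 4, "N5": 3, "N7": 1}
session2 = {"O1": 1, "N5": 2, "N7": 3}

# +1 for each rank gained between sessions, -1 for each rank lost, 0 if unchanged.
deltas = {v: session1[v] - session2[v] for v in session1}
print(deltas)                # {'O1': 3, 'N5': 1, 'N7': -2}
print(sum(deltas.values()))  # net movement across these violins
```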
dan_s Posted April 12, 2014

Starting to see the light? Here is my first statement: "Fritz and others are correct. There is NO evidence that any player scored any better than chance." I think that's clear enough. I realize that English is not your first language. Did I answer your questions? --the questions in capital letters?

It is clear - you do not understand the issue. YOU MUST NOT LOOK AT THE PLAYERS AS A GROUP. Just keep reading Carl's first post until the penny drops.
dan_s Posted April 12, 2014

That could be the case. But you cannot KNOW that his expertise caused the result unless you are God... or unless more tests are done.

More tests will increase the confidence. We are already confident. He was tested 6 times and got 5 right. He is NOT flipping coins. We are not doing a coin-flipping test. His biased coin (his expertise) has already been established. More tests will INCREASE our confidence in his expertise. Get it????

YOU MUST NOT LOOK AT THE PLAYERS AS A GROUP. YOU MUST LOOK AT THE PLAYERS INDIVIDUALLY. LOOKING AT THE PLAYERS AS A GROUP DOES NOT MAKE ANY SENSE.
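Whatever one makes of the shouting, the underlying claim that more tests sharpen the picture is quantitatively true. A sketch, assuming player 1 kept up the same 5-in-6 hit rate over more trials (the helper name p_at_least is mine, for illustration): the probability that chance alone does that well shrinks fast.

```python
from math import comb

def p_at_least(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p): how often pure chance does this well."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

print(p_at_least(5, 6))    # ~0.109  : 5 of 6   -- easily chance
print(p_at_least(10, 12))  # ~0.019  : 10 of 12 -- harder to dismiss
print(p_at_least(20, 24))  # ~0.0008 : 20 of 24 -- chance no longer plausible
```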
Marty Kasprzyk Posted April 12, 2014

Actually, you know 0.95^46 ≈ 9.4% of what you knew 46 years ago.

If I forgot 5% every year for 46 years, didn't I forget 230% of what I used to know?