Maestronet Forums

Created a database of the violin trade. Now what should I do with it?


Alex Chandler

I have a database of around 10,000 instruments containing maker name, instrument type, location made, year built, year sold, price sold, and photos of the instrument. I have so far calculated the inflation rate for instruments of different type, value, and age from auction data of instruments that have been publicly sold multiple times. My interests are twofold: one is to integrate statistics into the history of the violin trade. The other is to apply a branch of computer science to predict the price of instruments based only on a photo of the instrument. The process of creating such a database was excruciating. I may release it later, after I am done testing it for accuracy. My unfinished website can be viewed here: instrumentappraiser.com. The finished website should predict the quality of an instrument based solely on photos. It will never be perfect, but it could serve as a spam identifier or have some other unknown purpose.

A couple of interesting graphs and charts can be viewed here:

 https://violinsarecool.wordpress.com/home/history-of-violin-trade/

https://violinsarecool.wordpress.com/home/violin-inflation-statistics/

My questions are the following: if one were to have a database of the entire violin trade, what would one do with it? What are unanswered questions in the violin trade that could be solved with data? Does anyone else have interest in helping me with this project? Attached below is a compressed file of what I have accomplished so far.

snippets.zip
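A rough sketch of how a per-type appreciation rate could be computed from instruments that have sold more than once. The record layout, the sample rows, and the function names are all made up for illustration; this is not the code from the attached file, and the rates are nominal (not adjusted for general price inflation).

```python
from collections import defaultdict

# Hypothetical repeat-sale records:
# (instrument_id, instrument_type, year_sold, price_sold)
SALES = [
    ("vn-001", "violin", 1990, 10_000.0),
    ("vn-001", "violin", 2010, 40_000.0),
    ("vc-007", "cello", 2000, 20_000.0),
    ("vc-007", "cello", 2010, 30_000.0),
    ("va-003", "viola", 2005, 8_000.0),  # sold once only, so it is ignored
]


def annualized_rate(first_price, last_price, years):
    """Compound annual growth rate between two sales of the same instrument."""
    return (last_price / first_price) ** (1.0 / years) - 1.0


def rates_by_type(sales):
    """Average annualized rate per instrument type, using consecutive sale pairs."""
    by_instrument = defaultdict(list)
    for instrument_id, kind, year, price in sales:
        by_instrument[(instrument_id, kind)].append((year, price))

    rates = defaultdict(list)
    for (_, kind), history in by_instrument.items():
        history.sort()  # order each instrument's sales by year
        for (y0, p0), (y1, p1) in zip(history, history[1:]):
            if y1 > y0:  # skip same-year resales to avoid dividing by zero
                rates[kind].append(annualized_rate(p0, p1, y1 - y0))
    return {kind: sum(r) / len(r) for kind, r in rates.items()}


RATES = rates_by_type(SALES)
for kind, rate in sorted(RATES.items()):
    print(f"{kind}: {rate:.2%} per year")
```

Averaging per-pair rates is only one possible choice; weighting by holding period or fitting a repeat-sales regression would give different numbers on the same data.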

13 hours ago, Alex Chandler said:

The other is to apply a branch of computer science to predict the price of instruments based only on a photo of the instrument.

It sounds like a fascinating project, but (pace David Burgess) before letting the sceptics loose on it you'll obviously want to try it out intensively for yourself to see if your methodology has any predictive power at all. This was the line that brought me up short. A kind of facial recognition software for violins, not just to identify the maker but the price??? My gut feeling would be "not a chance". 

22 minutes ago, matesic said:

It sounds like a fascinating project, but (pace David Burgess) before letting the sceptics loose on it you'll obviously want to try it out intensively for yourself to see if your methodology has any predictive power at all. This was the line that brought me up short. A kind of facial recognition software for violins, not just to identify the maker but the price??? My gut feeling would be "not a chance". 

I hear you and share your skepticism about such a model working. In my opinion, artificial intelligence will never do as good a job of predicting the price of an instrument as a trained luthier, but for it to work even remotely is pretty cool. The algorithm is a kind of deep convolutional neural network, but I am not fully satisfied with it, so I am trying different models. For anyone with tech knowledge, the architectures are VGG16 and Xception. Similar architectures are used to categorize faces, although their power extends far beyond facial recognition. Currently, my model works well at distinguishing a shitty instrument from a good instrument, but it does not do a good job of distinguishing between a good instrument and a great instrument. I think the more useful part of my project is inflation data for different types of instruments. I am an amateur cellist, and have long been interested in how instruments appreciate over time.

I had a look at the site, but I'm struggling to make sense of it.

It seems like a neat idea, just a picture can tell you everything you need to know in an instant. In practice, I see this as heavily flawed.

A Storioni might be worth $500,000. The materials and workmanship could leave something to be desired.
A modern copy of the same Storioni might have better materials and workmanship and still be quite faithful in the details and antiquing, yet cost $30,000 or less.

Can you really deduce that from one picture of each, and accurately reflect the pricing? It seems very unlikely to me that this would ever be possible.

1 minute ago, Wood Butcher said:

I had a look at the site, but I'm struggling to make sense of it.

It seems like a neat idea, just a picture can tell you everything you need to know in an instant. In practice, I see this as heavily flawed.

A Storioni might be worth $500,000. The materials and workmanship could leave something to be desired.
A modern copy of the same Storioni might have better materials and workmanship and still be quite faithful in the details and antiquing, yet cost $30,000 or less.

Can you really deduce that from one picture of each, and accurately reflect the pricing? It seems very unlikely to me that this would ever be possible.

The website is unfinished, full of typos, I am sure, and still changing. But yes, I should say the tool is rather imperfect by its nature. The same could be said of some cellos of the Testore school that sell for obscene amounts but look crappy. I could boost accuracy by creating a price prediction model with multiple inputs, like maker name and age, as seen here for real estate (https://www.pyimagesearch.com/2019/02/04/keras-multiple-inputs-and-mixed-data/). I am more interested in what one could do with a database of this size. I have made an interactive map where one can pick a range of years and see where the instruments of that time are on a map of a country or geographic region.

2 hours ago, David Burgess said:

What should you do with it? You might want to give it a Beta test first, by putting it up here, and seeing what people have to say about it. There are some pretty good experts here, whose feedback might save you some grief down the road.

Good idea. What kind of format for the database would be the most accessible to a wide audience? Perhaps a CSV file?
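For what a CSV release might look like, here is a minimal sketch using Python's standard `csv` module. The column names follow the fields listed in the first post, but the exact spelling, the semicolon-separated `photo_files` column, and the sample rows are assumptions for illustration only.

```python
import csv
import io

# Column names are assumptions based on the fields described in the first post.
FIELDS = ["maker_name", "instrument_type", "location_made",
          "year_built", "year_sold", "price_sold", "photo_files"]

# Placeholder rows, not real auction data.
ROWS = [
    {"maker_name": "Maker, Example", "instrument_type": "cello",
     "location_made": "Milan", "year_built": 1720, "year_sold": 2005,
     "price_sold": 150000, "photo_files": "ex1_front.jpg;ex1_back.jpg"},
    {"maker_name": "Another Maker", "instrument_type": "violin",
     "location_made": "Mirecourt", "year_built": 1890, "year_sold": 2012,
     "price_sold": 9000, "photo_files": "ex2_front.jpg"},
]


def to_csv(rows):
    """Serialize rows to CSV text with a header line."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def from_csv(text):
    """Parse CSV text back into a list of dicts (all values come back as strings)."""
    return list(csv.DictReader(io.StringIO(text, newline="")))


CSV_TEXT = to_csv(ROWS)
LOADED = from_csv(CSV_TEXT)
print(CSV_TEXT)
```

The `csv` module quotes fields that contain commas, so a maker name like "Maker, Example" survives a round trip. Numbers come back as strings, so anyone loading the file has to convert years and prices themselves.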

14 hours ago, Alex Chandler said:

The finished website should predict the quality of an instrument based solely on photos. It will never be perfect, but it could serve as a spam identifier or have some other unknown purpose.

You'll have a hard time identifying particular makers. One reason is the lack of detail in available photos, particularly ones you'll get casually submitted to evaluate. It would be less of a problem to identify a particular violin, using various recognition techniques. That would be high confidence and quite easy, in fact, with photos that showed grain well.

If you wanted to evaluate say using contest evaluation criteria, photo quality would still be a problem.  In hand, a judge would prob. want to zoom in on places with a loupe, for example.  I think it's worth pursuing scientifically, maybe too early to have this application in mind

I've been known to geek out my own self and someone wanted me to write a phone app that would identify and price antiques, something that you could use while antique shopping instead of constantly referring to ebay for price.  To eliminate that human step iow.  It's similar to your problem and just in the discussion about it i realized it was too big of a problem.  some infrastructure piece that allowed it to be less of a problem might be worthwhile.  you can take a picture of something and have it looked up on amazon, but the trick there is amazon is finite -- if amazon doesn't carry the item it doesn't mean your app failed...

Two things: a former colleague of mine who has gone on to the higher end of the violin trade once said "Violins are the most deceptive things in the world to deal with because they are all made to look like something that they're not." And what I find regarding artificial intelligence is that it's more artificial than intelligent.

On 8/5/2020 at 7:32 PM, Alex Chandler said:

Currently, my model works well at distinguishing a shitty instrument from a good instrument, but it does not do a good job of distinguishing between a good instrument and a great instrument.

I'm struggling to understand how your recognition algorithm dissociates build quality from condition when it comes to deciding what is good and what is shitty; would it really prefer a Strad that's been knocked around a bit to a well-preserved Markie? I think the reason it doesn't do a good job distinguishing between higher-order instruments is that greatness isn't a material quality but an ill-defined cultural label.

2 hours ago, matesic said:

I'm struggling to understand how your recognition algorithm dissociates build quality from condition when it comes to deciding what is good and what is shitty; would it really prefer a Strad that's been knocked around a bit to a well-preserved Markie? I think the reason it doesn't do a good job distinguishing between higher-order instruments is that greatness isn't a material quality but an ill-defined cultural label.

One reason I haven't sent the code out is because I haven't tested it for things like that. Maybe there is a high correlation between black and white photos and high quality instruments (leading to me mistakenly thinking my algorithm works). Perhaps my algorithm is only looking at varnish and not any damage like a sound post crack. Skepticism in this part of my project is fully justified. If it does not work, well I will have a little toy to show people like that Silicon Valley clip above. I still think the bigger part of my project is inflation statistics. I can't seem to figure out why most violas appreciate less (aside from it being a limited dataset), as seen here: https://violinsarecool.wordpress.com/home/violin-inflation-statistics/
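One way to probe the black-and-white worry is to check, before any training, whether grayscale photos in the dataset carry systematically higher prices. A minimal sketch, assuming each photo is already available as a list of (R, G, B) tuples; the tolerance, the helper names, and the toy records are all hypothetical.

```python
from statistics import median


def is_grayscale(pixels, tolerance=8):
    """True if every pixel's R, G and B values are within `tolerance` of each other."""
    return all(max(p) - min(p) <= tolerance for p in pixels)


def median_price_by_photo_type(records):
    """Median sale price for grayscale photos versus color photos."""
    groups = {"grayscale": [], "color": []}
    for pixels, price in records:
        key = "grayscale" if is_grayscale(pixels) else "color"
        groups[key].append(price)
    return {key: median(prices) for key, prices in groups.items() if prices}


# Toy records: (pixels, price_sold). Two tiny "photos" of each kind.
RECORDS = [
    ([(120, 120, 120), (30, 31, 29)], 250_000),
    ([(200, 200, 200), (90, 90, 90)], 400_000),
    ([(180, 90, 40), (160, 80, 30)], 3_000),
    ([(150, 70, 35), (140, 60, 20)], 12_000),
]

SUMMARY = median_price_by_photo_type(RECORDS)
print(SUMMARY)
```

A large gap between the two medians would suggest the network could score well just by detecting photo type. One possible check is to measure accuracy within each group separately, or to convert every image to grayscale before training and see how much accuracy drops.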

As someone who has done quite a bit of work in AI and statistical pattern recognition, I can say that the greatest challenge nowadays is not the algorithms, but the availability, quantity, and quality of the training and testing data.

Training an algorithm to recognize vintage violins (and do price estimates) similar to an expert human being is going to be impossible because there is not enough reliable data to train the system. For example, when an expert looks at a violin, he/she looks at it inside and outside from different angles, perspectives, and measurements. Collecting this data into a database in a usable systematic way is essentially impossible, but it would be absolutely necessary for such a system to work.

You might consider an AI version of Donald Cohen's Red Book, presenting decades of auction results. (Or maybe the Red Book data provided the fodder for your calculations?)  Also of value would be publishing something like Roy Ehrhart's Violin Identification and Price Guide (Volumes 1-3), which presented retail prices from shop catalogs, with the prices updated based on inflation up to his date of printing.  Wonderful information to help one sort through confusing pricing, especially if one is handicapped by the belief that there should be "one" price for the work from any particular maker.  My impression is that both the Red Book and the Price Guide represent labors of love and lots of it (ie, labor) -- not something to get rich from.  Good luck!

3 hours ago, GeorgeH said:

As someone who has done quite a bit of work in AI and statistical pattern recognition, I can say that the greatest challenge nowadays is not the algorithms, but the availability, quantity, and quality of the training and testing data.

Training an algorithm to recognize vintage violins (and do price estimates) similar to an expert human being is going to be impossible because there is not enough reliable data to train the system. For example, when an expert looks at a violin, he/she looks at it inside and outside from different angles, perspectives, and measurements. Collecting this data into a database in a usable systematic way is essentially impossible, but it would be absolutely necessary for such a system to work.

It’s impossible for it to fully work. But I have 2,000 unique makers, 15,000 instruments, and 40,000-plus photos now, all labeled with price. With an image generator, I have converted those photos into 100,000-plus photos for the convolutional neural network. It’s not anywhere close to being perfect. The more epochs, the higher the “accuracy” but also the greater the overfitting. It should be able to tell garbage from anything not garbage. But yes, you are right. It’s not meant to replace a luthier because it can’t replace one. I see its purpose as being for someone who is not at all an expert. Years ago, when I first looked at buying a cello, I found a nice cello on eBay made in the 17th century by "Hannibal". I knew very little about the violin market, and thought that I had found a diamond in the rough. Only after reverse image searching it did I realize this was 100% a scam. My algorithm, once complete, could serve as a spam identification service for stringed instruments. When another “fine Italian Cello 17th century” is listed at 2,000 dollars on eBay, a model with multiple inputs (text and photos) could flag this.
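Even before any neural network, the "17th century Italian cello for 2,000 dollars" case could be caught by a plain rule that compares the asking price with past sales matching the listing's claims. This is a simple rule-based baseline, not the multi-input model described above; the comparable sales, the `ratio` threshold, and the function name are hypothetical.

```python
from statistics import median

# Hypothetical comparable sales: (country, century_built, price_sold)
COMPARABLES = [
    ("Italy", 17, 180_000), ("Italy", 17, 320_000), ("Italy", 17, 95_000),
    ("Germany", 19, 4_000), ("Germany", 19, 6_500),
]


def is_suspicious(country, century, asking_price, comparables, ratio=0.1):
    """Flag a listing whose asking price is far below the median comparable sale."""
    prices = [p for c, cen, p in comparables if c == country and cen == century]
    if not prices:
        return False  # no comparables, so this rule cannot say anything
    return asking_price < ratio * median(prices)


print(is_suspicious("Italy", 17, 2_000, COMPARABLES))
print(is_suspicious("Germany", 19, 5_000, COMPARABLES))
```

The country and century would have to be pulled out of the listing text somehow, which is left out here. A flag from a rule like this only says the price does not match the claim, not that the instrument itself is worthless.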

2 hours ago, Richf said:

You might consider an AI version of Donald Cohen's Red Book, presenting decades of auction results. (Or maybe the Red Book data provided the fodder for your calculations?)  Also of value would be publishing something like Roy Ehrhart's Violin Identification and Price Guide (Volumes 1-3), which presented retail prices from shop catalogs, with the prices updated based on inflation up to his date of printing.  Wonderful information to help one sort through confusing pricing, especially if one is handicapped by the belief that there should be "one" price for the work from any particular maker.  My impression is that both the Red Book and the Price Guide represent labors of love and lots of it (ie, labor) -- not something to get rich from.  Good luck!

Very cool book, and thank you for letting me know about these sources of information. I may try to do so later in my life. Unfortunately, the summer is running out of days and writing any code to convert a book into a database would take 100+ hours and be an absolute pain in the ass. Big kudos though to Donald Cohen for writing it. I can appreciate the time it took to create such a database. I think you pretty much summed it up perfectly at the end. The source of appreciation for such a database will not be money. 

1 hour ago, Alex Chandler said:

It should be able to tell garbage from anything not garbage.

I guess it depends on your definition of "garbage."

I think anything listed as a “fine Italian Cello 17th century” on eBay has a probability of about 99.999% of not being a “fine Italian Cello 17th century," and one does not need an AI program to determine that. However, that does not mean the cello is "garbage."

I would posit a better and easier approach would be to scrape eBay listings and use purely text analysis to predict if a listing is for "garbage" or not. You would include all the text - description, price, condition statement, seller, etc. I bet it would make much more accurate predictions than predictions based on picture analysis.
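As a rough illustration of the text-only idea, here is a tiny bag-of-words naive Bayes classifier in plain Python. The training listings and the "suspect"/"ordinary" labels are invented for the example, and only the description text is used; price, seller, and condition would need to be added as further features.

```python
import math
import re
from collections import Counter

# Hypothetical hand-labeled listing descriptions.
TRAINING = [
    ("fine old italian cello 17th century rare antique no reserve", "suspect"),
    ("antique stradivarius violin found in attic rare old", "suspect"),
    ("rare old master violin 18th century estate find", "suspect"),
    ("student cello outfit with bow and case new", "ordinary"),
    ("german workshop violin circa 1920 with certificate from dealer", "ordinary"),
    ("modern viola by living maker with certificate", "ordinary"),
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def train(examples):
    """Count words per label, documents per label, and the overall vocabulary."""
    word_counts = {}
    label_counts = Counter()
    vocabulary = set()
    for text, label in examples:
        label_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for token in tokenize(text):
            counts[token] += 1
            vocabulary.add(token)
    return word_counts, label_counts, vocabulary


def classify(text, model):
    """Return the label with the highest log-probability (add-one smoothing)."""
    word_counts, label_counts, vocabulary = model
    total = sum(label_counts.values())
    scores = {}
    for label, n in label_counts.items():
        counts = word_counts[label]
        denominator = sum(counts.values()) + len(vocabulary)
        score = math.log(n / total)
        for token in tokenize(text):
            if token in vocabulary:  # ignore words never seen in training
                score += math.log((counts[token] + 1) / denominator)
        scores[label] = score
    return max(scores, key=scores.get)


MODEL = train(TRAINING)
print(classify("fine italian cello 17th century", MODEL))
print(classify("student violin outfit with case", MODEL))
```

Six examples prove nothing about accuracy; a real test would need a few thousand scraped listings with trustworthy labels, which is the hard part.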

 

for a sanity check you need to do this successfully with something much, much easier to identify and assign a price to, like cars.  don't cheat and look at their badges...  at a certain stage it's easy to glaze over and ignore  and insulate yourself from problems that make what you want to do impossible

42 minutes ago, Bill Merkel said:

for a sanity check you need to do this successfully with something much, much easier to identify and assign a price to, like cars.  don't cheat and look at their badges...  at a certain stage it's easy to glaze over and ignore  and insulate yourself from problems that make what you want to do impossible

One thing that is actually super easy to make is a stringed instrument classifier. There are hundreds of public databases of car and cat photos, and I would label each such image as "not a stringed instrument". That could avoid labeling toilet paper or a rubber duck as an expensive instrument.

On 8/4/2020 at 11:48 PM, Alex Chandler said:

I have a database of around 10,000 instruments containing maker name, instrument type, location made, year built, year sold, price sold, and photos of the instrument. I have so far calculated the inflation rate for instruments of different type, value, and age from auction data of instruments that have been publicly sold multiple times. My interests are twofold: one is to integrate statistics into the history of the violin trade. The other is to apply a branch of computer science to predict the price of instruments based only on a photo of the instrument. The process of creating such a database was excruciating.

 

Compiling and maintaining the database sounds like more than enough of a task.  In all candor, predicting the price of instruments from photos sounds like a fool's errand.  Good luck.

Not to be rude, but this seems about as likely to work as training a neural network to predict criminal behavior based on the shape of people's skulls. You're looking to derive a meaningful model from inputs that are not predictive of the outputs beyond whatever accidental relationships may be found in your training data.

1 hour ago, Adrian Lopez said:

Not to be rude, but this seems about as likely to work as training a neural network to predict criminal behavior based on the shape of people's skulls. You're looking to derive a meaningful model from inputs that are not predictive of the outputs beyond whatever accidental relationships may be found in your training data.

The quality of how an instrument looks is a predictive measure for the value of an instrument. It is one of many measures, like maker name, but I suggest you learn more about convolutional neural networks with multiple inputs before comparing them to 19th-century phrenology.
