FedCloudPeachnote

From EGIWiki






General Information

  • Status: Production
  • Start Date: 11/09/2012
  • End Date: In production from May 2014
  • Cloud sites: CESNET-MetaCloud, FZJ
  • Virtual Organisation: peachnote.com
  • EGI.eu contact: Gergely Sipos / gergely.sipos@egi.eu
  • External contact: Vladimir Viro / vladimir@viro.name

Short Description

Peachnote is a music score search engine and analysis platform. The system is the first of its kind and can be thought of as an analogue of Google Books Ngram Viewer and Google Books search for music scores. Libraries all over the world are digitizing hundreds of thousands of music scores. In contrast to books, these scores generally remain inaccessible for content-based retrieval and algorithmic analysis. There is no analogue to Google Books for music scores, and no large corpora of symbolic music data exist that would empower musicology in the way large text corpora empower computational linguistics, sociology, history, and other humanities that rely on the printed word as their major source of evidence. We want to help change that. Peachnote provides visitors and researchers with access to a massive amount of symbolic music data.

Building up the corpora of sheet music requires a significant amount of computing capacity. Millions of sheets need to be converted from JPG images to MusicXML, sometimes multiple times, using different configurations of the converter algorithm. Peachnote is already integrated with the Amazon Simple Queue Service (SQS). Amazon SQS will remain the coordinator of computation, while site(s) of the EGI Federated Cloud perform the calculations.

http://www.peachnote.com

Use Case

Use case 1 (in production)

Ability to upload and start a prepared VMware VM. The VM only has to be able to make outbound connections: to Amazon SQS for job information, to an HBase cluster to retrieve and store data, and to the Peachnote server to regularly update the workflow code. No inbound connections are needed, which should mean fewer administrative and security concerns.
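
The outbound-only worker pattern described above can be sketched as a simple pull loop. This is a minimal illustration, not Peachnote's actual code: every name in it (`run_worker`, `receive_job`, `load_page`, `convert`, `store_result`, `acknowledge`, and the `page_id` and `config` job fields) is hypothetical. In-memory stand-ins replace the real services, so the queue plays the role of Amazon SQS and the dictionaries play the role of the HBase cluster. The periodic workflow code update is left out.

```python
from collections import deque
from typing import Callable, Optional


def run_worker(
    receive_job: Callable[[], Optional[dict]],
    load_page: Callable[[str], str],
    convert: Callable[[str, str], str],
    store_result: Callable[[str, str, str], None],
    acknowledge: Callable[[dict], None],
    max_jobs: int,
) -> int:
    """Pull jobs until the queue is empty or max_jobs have been processed.

    Every call goes outward from the worker; nothing connects in.
    """
    done = 0
    while done < max_jobs:
        job = receive_job()  # ask the queue for work (SQS in the text)
        if job is None:
            break  # nothing left to do
        image = load_page(job["page_id"])  # read the scanned page
        result = convert(image, job["config"])  # image -> symbolic music data
        store_result(job["page_id"], job["config"], result)
        acknowledge(job)  # only after the result has been stored
        done += 1
    return done


# In-memory stand-ins for the queue and the data store (assumptions).
queue = deque(
    [
        {"page_id": "p1", "config": "default"},
        {"page_id": "p2", "config": "default"},
    ]
)
images = {"p1": "jpg-1", "p2": "jpg-2"}
results = {}
acked = []

processed = run_worker(
    receive_job=lambda: queue.popleft() if queue else None,
    load_page=lambda page_id: images[page_id],
    convert=lambda image, config: f"converted:{image}",
    store_result=lambda page_id, config, result: results.update(
        {(page_id, config): result}
    ),
    acknowledge=lambda job: acked.append(job["page_id"]),
    max_jobs=10,
)
print(processed, results)
```

Acknowledging a job only after its result is stored means a VM that disappears mid-conversion does not lose work, provided the real queue hands unacknowledged jobs out again. Keying results by page and converter configuration allows the same sheet to be converted several times with different settings.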

  1. Choose and obtain access to site(s) of the EGI Federated Cloud testbed
  2. Convert the Peachnote VM to the format expected by the EGI site(s) (it is currently in VMware format)
  3. Set up the Peachnote VM service on the sites
  4. Document feedback and requirements for the EGI Federated Cloud Task Force
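
Step 2 could be scripted along the following lines. This is a sketch under stated assumptions: the page does not say which tool or target format was used, so `qemu-img` as the converter, `qcow2` as the default target format, and the file name `peachnote-worker.vmdk` are all illustrative choices. The helper names `qemu_img_convert_command` and `convert_image` are hypothetical as well.

```python
import subprocess
from pathlib import Path
from typing import List


def qemu_img_convert_command(source: Path, target_format: str = "qcow2") -> List[str]:
    """Build a qemu-img command that converts a VMware (vmdk) disk image.

    The target file sits next to the source and takes the format as suffix.
    """
    target = source.with_suffix("." + target_format)
    return [
        "qemu-img", "convert",
        "-f", "vmdk",           # input format: VMware disk image
        "-O", target_format,    # output format (assumed; depends on the site)
        str(source),
        str(target),
    ]


def convert_image(source: Path, target_format: str = "qcow2") -> Path:
    """Run the conversion; requires qemu-img to be installed on the machine."""
    command = qemu_img_convert_command(source, target_format)
    subprocess.run(command, check=True)  # raises if qemu-img reports an error
    return Path(command[-1])


# Only build and show the command here; nothing is executed.
command = qemu_img_convert_command(Path("peachnote-worker.vmdk"))
print(" ".join(command))
```

Separating command construction from execution makes it easy to check the command before running it against a large image, and to switch the target format per site.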

Use case 2 (cancelled)

Ability to run a small Hadoop and HBase cluster in the cloud. Being able to spin up an HBase cluster using Apache Whirr on the EGI cloud infrastructure would be ideal, but a custom deployment would be a great first step.

Additional Files