Digital Library Proposal

From WEFT Wiki

This proposal for building a Digital Library of WEFT's media collection was initiated by members of the Music Committee. This was a natural place of origin, because the Music Committee and its genre directors are tasked with ensuring the viability of WEFT's music collection. The project has since been spun off into a separate, independent committee (the Digital Library Committee) under the Board of Directors.

This proposal comprises a justification of the project from several points of view along with the phases of development the project will proceed through and their respective goals. This proposal is to be submitted to the Board of Directors for approval and funding. Questions about the proposal should be directed to the committee at weftdb@weft.org.

Justification

The realization of a Digital Library at the station is an exciting prospect, and we can no longer afford to let it remain hypothetical. The Digital Library will increasingly be the conduit for distribution from labels, a valuable source of information for our current and prospective audience, and a critical resource for our airshifters.

Labels and digital distribution

With regard to radio promotion, the music industry is rapidly transitioning to a fully digital business model. The music labels, promoters, and distributors are growing increasingly reluctant to mail us actual physical CDs in jewel cases. The trend is inexorably toward a model in which WEFT's genre directors must go to a web site, enter a login and password, and download files of whatever songs we are authorized to download. We can also generally download a .jpg file of "cover art" for that particular release, and a Microsoft Word document consisting of a "one-sheet", which is a brief description of the artist and the music.

While genre directors across the entire musical spectrum are now receiving queries about digital distribution, the trend currently seems to affect the distribution of world music the most, probably because of the cost of postage for packages sent from overseas. Two major world music distributors have already announced their intention to go fully digital in the very near future, and more are soon to follow.

Spectre is probably the largest single distributor of world music (and other types of music as well), handling six or eight major world music labels from both the U.S. and overseas (Belgium, Germany). While they are still grudgingly sending us CDs if we request them, they have already set up a parallel method of digital distribution which they are encouraging radio stations to use (with the not-so-veiled implication of penalties for non-use). Spectre has announced that its intention is for its music distribution to be fully digital by sometime in 2008.

World Music Network, based in England and one of the largest single producers of world music, produces the Rough Guide series of world music CDs and also the Riverboat label. They do their distribution through Worldisc in California. About a year and a half ago they quit mailing us jewel cases. Recently Worldisc announced that World Music Network is no longer mailing tray cards either, and that they were "experimenting" with digital distribution, starting with the Rough Guide to the Music of Congo. We never did receive a CD for that Rough Guide, and we did not yet have a method in place for using the .mp3 version of the music. There has been no indication that we'll be receiving any more Rough Guide CDs through the mail.

This fully digital business model has the effect of shifting not only the cost but also the burden of work away from the music labels/distributors and toward the radio stations. In some ways, at least in the short term, this new model will create quite a bit more work for WEFT's genre directors. It is hoped, though, that adequate computer hardware and properly written software will eliminate other tasks and so balance out the workload over the longer term.

In any event, we seem to have little say in the rapid movement toward digital distribution of music, and adequate planning and preparation are mandatory starting now if we are to remain viable in the 21st century as a broadcaster of quality music.

Availability and security of collection

WEFT has a very large collection of music and other audio media in many formats: CDs, vinyl, minidiscs, and even audio files on the Content Depot. The organization and security of the library are far from perfect, and it can be challenging to find a particular recording on the shelves. And of course, you have to be at the station to browse the collection. The Digital Library is the foundation for solving these problems.

The Digital Library will store not just the media files themselves, but also an arbitrarily large set of metadata about each file. This metadata will include the information you're used to seeing in an mp3 (artist, title, album, duration) as well as any additional information that could be useful to us. For example, we will track the genre that the album is filed under at WEFT in addition to its "natural" genre, which doesn't necessarily fit precisely into WEFT's genre system. An airshifter could program a show, possibly even from home, by looking up an initial album, then searching for other albums featuring the same musicians, building up a playlist without having to hunt down CDs on the shelves. A world music host could search for artists from a specific country or featuring a specific instrument.
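
As a rough illustration of the metadata described above, a record might look like the following sketch. The field names, the open-ended `extra` dictionary, the `find_tracks` helper, and the sample entries are all hypothetical; this proposal does not fix a schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Track:
    """One media file plus its metadata (field names are illustrative only)."""
    artist: str
    title: str
    album: str
    duration_seconds: int
    weft_genre: str      # genre the album is filed under at WEFT
    natural_genre: str   # the album's own genre, which may not fit WEFT's system
    extra: Dict[str, List[str]] = field(default_factory=dict)  # any additional metadata


def find_tracks(tracks, key, value):
    """Return tracks whose extra metadata lists `value` under `key` (case-insensitive)."""
    wanted = value.lower()
    return [t for t in tracks
            if wanted in (v.lower() for v in t.extra.get(key, []))]


# Made-up sample entries, for illustration only.
catalog = [
    Track("Example Artist A", "Song One", "Album X", 215, "World", "Soukous",
          {"country": ["Congo"], "instrument": ["guitar"]}),
    Track("Example Artist B", "Song Two", "Album Y", 187, "World", "Fado",
          {"country": ["Portugal"], "instrument": ["guitarra"]}),
]

for track in find_tracks(catalog, "country", "congo"):
    print(track.artist, "/", track.title)
```

Keeping the WEFT filing genre and the "natural" genre as separate fields lets a search use either one, and the open-ended dictionary leaves room for attributes such as country or instrument without changing the record layout.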

Music disappearing from the shelves is a small but constant problem. Our jazz genre director comes to almost every committee meeting with a list of albums that have gone missing since the last meeting. Locking down access to WEFT's hard-copy music collection would surely cause more problems than it would solve. But simply archiving the music into the Digital Library will mitigate the loss, at least.

While we will not have the entire collection digitized in the first phase, it is an eventual goal. A local company, One Llama, is keenly interested in helping us gather data about our music collection. If we are able to collaborate with One Llama, they may be able to help us to significantly streamline the digitization of the library by employing an automatic CD-reading machine and their own sophisticated music analysis software.

Engaging the audience

Having easy access to data about our collection opens up many exciting opportunities for sharing information with our audience. Playlists built from the Digital Library could be automatically posted to the website. (See WWOZ's website for a wonderful example.) Charting information, now laboriously collected and collated by genre heads, could be automatically gathered, analyzed, and submitted to charting services. Dead air could be filled with music requested online by our listeners. Artists and labels could see their names coming up on our charts and decide to send us more from their catalogs. Prospective listeners from around the world could Google for an obscure artist and find that all they need to do is tune in to the stream Sunday evenings.
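
To give a sense of how little code the automatic chart gathering might need once playlists are recorded digitally, here is a minimal sketch. It assumes a play log of (artist, title) pairs; the `top_artists` name and the sample log are hypothetical, and real charting services will have their own submission formats.

```python
from collections import Counter


def top_artists(play_log, n=10):
    """Count plays per artist in a list of (artist, title) pairs; return the n most played."""
    counts = Counter(artist for artist, _title in play_log)
    return counts.most_common(n)


# Made-up play log, for illustration only.
play_log = [
    ("Artist A", "Song 1"),
    ("Artist B", "Song 2"),
    ("Artist A", "Song 3"),
]

print(top_artists(play_log, n=2))
```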

The possibilities are vast, and the time to grasp them is now.

Budget

Initial requirements

These are the things we need to get the project off the ground. Without them, we can't do much more than speculate. The digital library will also require two dedicated channels on the new audio console.

Item                          Cost     Notes
Digital Library Server        $1,150   quote
Broadcast Studio Computer     $800
Keyboard-Mouse-Video Switch   $75      quote
Tripp Lite Surge Protector    $100     quote ??
16-Port Gigabit Switch        $200     quote
Audio Cables                  $50
Total                         $2,375

Digital Library Server
This computer is responsible for storing all of the encoded media files as well as running any services (such as databases and network access) necessary to support the library. It will be secured in the server rack and accessible over the local network, plugged into the network switch and surge protector.
Broadcast Studio Computer
This computer performs the live audio playback inside the studio. The computer will have two audio cards to allow simultaneous playback of two audio streams, to be mixed directly on the audio console. It will be plugged into the network switch and surge protector and will share keyboard, mouse, and video connections with the existing ENCO computer.
Keyboard-Mouse-Video Switch
This is necessary to avoid crowding the booth. It will allow airshifters to switch the keyboard, mouse, and monitor between the ENCO and the Digital Library (and up to two other computers) without needing additional keyboards, mice, and monitors. The KVM will require a small amount of space either on top of the desk in the booth or in an easily visible and accessible location.
Tripp Lite Surge Protector
This keeps the magic smoke safely contained within the electronic doodads. The surge protector will be mounted in the server rack.
16-Port Gigabit Switch
This is the center of all wired networking at the station. All of the networked servers and computers will go through it, as will the wireless router. A gigabit switch (as opposed to the current 100 Mbit switch) will allow for much faster transfers of large media files between the various systems. The switch will be mounted in the server rack.
Audio Cables
These will connect the broadcast studio computer to the audio console. We will need two sets, one from each of the two audio cards to the two respective channels on the audio board.
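
To put the benefit of the gigabit switch in rough numbers, the sketch below compares best-case transfer times at the two link speeds. The 700 MB file size (roughly one full CD of uncompressed audio) is an assumption, and the figures are theoretical line-rate minimums; real transfers will be slower because of protocol overhead and disk speed.

```python
def transfer_seconds(size_megabytes, link_megabits_per_second):
    """Best-case time to move a file over a link, ignoring all overhead."""
    bits = size_megabytes * 8_000_000                      # 1 MB = 8,000,000 bits
    return bits / (link_megabits_per_second * 1_000_000)   # link speed in bits per second


ASSUMED_SIZE_MB = 700  # assumption: about one CD of uncompressed audio

for label, speed in [("100 Mbit", 100), ("gigabit", 1000)]:
    print(f"{label}: {transfer_seconds(ASSUMED_SIZE_MB, speed):.1f} s")
```

Under these assumptions the same file takes about 56 seconds on the current switch and under 6 seconds on a gigabit switch.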

Next year

These components will be important for the second phase, to begin one year from the beginning of the first phase.

Item                          Cost     Notes
Music Committee Computer      $600
Smart UPS 2200VA              $1,000   example
Backup Server (NAS, 2TB)      $2,000   example
Total                         $3,600

Music Committee Computer
This desktop computer will serve as a place for genre heads to digitize and process music. When not occupied with that task, it will be available to station members to peruse the digital library. The general public will not be able to copy music off of this workstation. This computer will be on a desk in the Great Hall.
Smart UPS 2200VA
This uninterruptible power supply (UPS) will allow the servers (up to six of them) to continue running for several minutes in the event of a power failure (how long depends on the unit's capacity and the connected load). If the power failure is short, this simply allows them to continue running. The "smart" feature is that the servers can be automatically triggered to shut down safely if the power outage exceeds a specified duration. The UPS will be mounted inside the server rack. It is not expected that any desktop computers will be connected to the UPS.
Backup Server (NAS, 2TB)
This server will hold backups of all mission-critical data at the station, including the data necessary to run the Digital Library. The backup server will be mounted in the server rack and will be on the local network and plugged into the UPS.

Timeline

  • hardware setup: 6 weeks
  • assurance from engineering that a PC can be put inside either broadcast studio (alex)
  • assurance about rack placement being stable there for life of studio
  • software investigations will continue
  • end goal of 1st step is a system in which experienced airshifters can enter some data into the DL and play back on air from the DL: a searchable system, some type of queuing playlist, and playback to a listening station (no automatic playlist generation yet, no remote access yet, etc.)
  • culmination of hardware and software is 6 months

Phase 1

(full-size diagram)

Timeline: 6 months (Summer 2008)

Phase 1 proposes to create a music encoding, storage, and playback system. By the conclusion of Phase 1, WEFT will have a working computer system that provides tech-savvy airshifters with secure access to an initial set of encoded music. The music in this pool will be new music ripped by a group of willing genre directors.

There are two stages to Phase 1. The first stage will be to implement a hardware and software architecture for storage and playback of music. The initial plan is for a server at the station hosting an mp3 collection and sharing it across a password-protected, private, local network. This system, detailed below, will be designed to provide easy access to end users while preventing unauthorized copying of encoded music. As the system will be experimental at this stage, updates on progress and functionality will be provided to the Music Committee, Engineering Committee, and other specifically relevant parties.

The second stage entails populating the system with its initial collection of music. Participating genre directors will begin adding new music to the computer system as part of their regular duty of adding incoming music to the physical collection. Within four months after the initial system is built, a working set of current new music from participating genres will be available for playback from the system, leaving time to train the remaining genre directors.

Phase 1 is intended as a beta testing phase of the final, full-featured digital library system. It is not intended to be polished and presentable to airshifters at large.

Hardware requirements

The Digital Library will require a server to host media files and provide playback and database functionality; despite its grand purpose, the hardware requirements are modest. We have quotes for an adequate system in the $1000 to $1200 range.

Our best option is a 1U rackmount server from SW Technologies at just over $1000 plus shipping. (See quote) This server would be secured inside the existing server rack. The server itself will run Ubuntu Linux but will be configured to allow secure access to authorized individuals on any operating system. In particular, the computer in the Great Hall will serve as the workstation at which media is fed into the library, and we envision a listening station and collaboration with the Programming Committee on the audio production workstation. Controlling access to the Digital Library will be the Music Committee's responsibility.

The server will have two high-capacity drives, one serving as a hot-swappable backup of the primary drive. Eventually, we will need to expand the storage capacity of the Digital Library server and implement a more robust backup plan, but the details are beyond the scope of this proposal and should be part of a more general, station-wide IT strategy.
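
A minimal sketch of how the primary drive could be mirrored onto the backup drive with rsync, the tool named under Software configuration. The mount points `/mnt/library` and `/mnt/backup` and the helper names are assumptions for illustration; the actual paths and schedule are not specified in this proposal.

```python
import subprocess


def rsync_mirror_command(source_dir, dest_dir):
    """Build an rsync command that makes dest_dir an exact copy of source_dir."""
    # A trailing slash on the source tells rsync to copy the directory's
    # contents rather than the directory itself.
    source = source_dir.rstrip("/") + "/"
    # --archive preserves permissions, ownership, and timestamps;
    # --delete removes files from the copy that no longer exist in the source.
    return ["rsync", "--archive", "--delete", source, dest_dir]


def run_mirror(source_dir, dest_dir):
    """Run the mirror; raises CalledProcessError if rsync reports a failure."""
    subprocess.run(rsync_mirror_command(source_dir, dest_dir), check=True)


# Assumed mount points, shown here without actually running rsync.
print(" ".join(rsync_mirror_command("/mnt/library", "/mnt/backup")))
```

Note that a true mirror also propagates accidental deletions to the backup on the next run, which is one reason a more robust backup plan will eventually be needed.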

Software configuration

Backend

  1. Label drives, one as "backup" and one as "library".
  2. Format both drives as single ext3 partitions with Unicode (UTF-8) support.
  3. Install rsync to sync the library drive to the backup drive.
  4. Create a permission group and a "library" account for genre directors and DLW members to have read and write access to the library drive.
  5. Only the superuser running rsync will have write access to the backup drive; the library group will have read-only access to it. This will allow the backup to be swapped in read-only until the live drive can be fixed.
  6. Create a directory hierarchy for the library:
/Library
  |
  +-/Audio
  |   |
  |   +-/Music
  |   |   |
  |   |   +-/<genre>
  |   |   |   |
  |   |   |   +-/<artist>
  |   |   |   |   |
  |   |   |   |   +-/<album>
  |   |   |   |
  |   |   |   ...
  |   |   |
  |   |   ...
  |   |
  |   +-/Other
  |
  +-/Other
  |
  +-/Incoming
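
The hierarchy above could be created and addressed with a few lines of Python. This is only a sketch: the helper names are hypothetical, and replacing "/" in names with "_" is an assumed convention for artists or albums whose names contain a slash.

```python
import tempfile
from pathlib import Path

# Fixed part of the hierarchy shown above, relative to the library root.
TOP_LEVEL = ["Audio/Music", "Audio/Other", "Other", "Incoming"]


def create_skeleton(root):
    """Create the fixed directories under root (safe to run more than once)."""
    root = Path(root)
    for relative in TOP_LEVEL:
        (root / relative).mkdir(parents=True, exist_ok=True)
    return root


def safe_name(name):
    """Make a genre, artist, or album name usable as a single directory name."""
    cleaned = name.replace("/", "_").replace("\0", "").strip()
    return cleaned or "Unknown"


def album_dir(root, genre, artist, album):
    """Return the <genre>/<artist>/<album> directory for an album under Audio/Music."""
    return (Path(root) / "Audio" / "Music"
            / safe_name(genre) / safe_name(artist) / safe_name(album))


# Demonstration in a scratch directory rather than on a real library drive.
with tempfile.TemporaryDirectory() as scratch:
    library = create_skeleton(Path(scratch) / "Library")
    created = sorted(p.relative_to(library).as_posix()
                     for p in library.rglob("*") if p.is_dir())
    print(created)
```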

Processing

  1. Install LAME and its GStreamer interface.
  2. Install Grip and configure it to rip into the correct directories using -V 0 --vbr-new for LAME.
  3. Install the MusicBrainz client to assist with filling out metadata.
  4. At this point, a genre head can log into the Great Hall computer, put a CD in the drive, launch the ripper, confirm and modify metadata, click rip, and have the music go directly to the library.
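
For reference, the encoder settings in step 2 correspond to a LAME command line like the one built below. Grip would issue the equivalent call itself once configured; the helper name and the example paths and tag values are hypothetical. The `--tt`, `--ta`, and `--tl` options set the ID3 title, artist, and album tags.

```python
def lame_command(wav_path, mp3_path, title, artist, album):
    """Build a LAME command using the proposal's -V 0 --vbr-new quality settings."""
    return [
        "lame",
        "-V", "0", "--vbr-new",   # highest-quality variable bitrate, newer VBR routine
        "--tt", title,            # ID3 title tag
        "--ta", artist,           # ID3 artist tag
        "--tl", album,            # ID3 album tag
        str(wav_path),
        str(mp3_path),
    ]


# Made-up example values; nothing is executed here.
command = lame_command("track01.wav", "track01.mp3",
                       "Song One", "Example Artist A", "Album X")
print(" ".join(command))
```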

Playback

  1. Install mpd (Music Player Daemon) and configure it to index the library's audio directory. This is an audio player that runs as a background service and is easy to control remotely.
  2. Install Ampache and dependencies. This is a web-based application for searching music, building playlists, and controlling mpd.
  3. Configure Ampache to use mpd and restrict it to only be accessible on the local network or to authenticated users (genre directors).
  4. At this point, anyone with a laptop or access to a web browser on one of the studio computers can log into the system and play music from the Digital Library.
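
Ampache would normally handle the conversation with mpd, but the sketch below shows roughly what "easy to control remotely" means: mpd accepts plain-text commands over a TCP connection (port 6600 by default) and answers each with `OK` or an `ACK` error line. The helper names, the host name, and the example path are assumptions for illustration.

```python
import socket


def format_command(name, *args):
    """Format one mpd command line, quoting and escaping each argument."""
    quoted = []
    for arg in args:
        escaped = str(arg).replace("\\", "\\\\").replace('"', '\\"')
        quoted.append(f'"{escaped}"')
    return " ".join([name, *quoted]) + "\n"


def send_commands(commands, host="localhost", port=6600, timeout=5.0):
    """Send preformatted commands to mpd; return the reply lines for each one."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        stream = conn.makefile("rw", encoding="utf-8", newline="\n")
        greeting = stream.readline()
        if not greeting.startswith("OK MPD"):
            raise RuntimeError(f"unexpected greeting: {greeting!r}")
        replies = []
        for command in commands:
            stream.write(command)
            stream.flush()
            lines = []
            while True:
                line = stream.readline()
                if not line:
                    raise ConnectionError("mpd closed the connection")
                line = line.rstrip("\n")
                if line == "OK":
                    break
                if line.startswith("ACK"):
                    raise RuntimeError(line)
                lines.append(line)
            replies.append(lines)
        return replies


# Commands to queue one (made-up) album directory and start playback.
queue_and_play = [
    format_command("clear"),
    format_command("add", "Music/World/Example Artist A/Album X"),
    format_command("play"),
]
print("".join(queue_and_play), end="")
```

With mpd running on the server, `send_commands(queue_and_play, host="library-server")` would queue the album and start playback; `library-server` is a placeholder host name.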

Integration issues

The Digital Library server will need to be integrated into other systems in order to be useful.

  1. The server will require one unit (1U) in the server rack. It would be wise to plan on one to two additional units for a station-wide backup server and networked storage device.
  2. The server must be on the local network but not directly accessible from outside. Data on the server will only be accessible through secured channels as moderated by the Music Committee.
  3. For playback, the library will need a dedicated channel on the boards in both studios.
  4. We will need some guarantee of the physical security of the server. At this point, we presume that the server rack will always be locked and that the Music Committee will receive a key.
  5. Some thought needs to be put into consolidating and streamlining WEFT's computer assets, though the details of such a program are beyond the scope of this proposal.
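
As an illustration of the local-network restriction in item 2, the check below decides whether a client address belongs to the station's network. The `192.168.1.0/24` subnet and the function name are assumptions; in practice this rule would more likely live in the firewall or web server configuration than in application code.

```python
import ipaddress

# Assumed address range of the station's local network.
LOCAL_NETWORK = ipaddress.ip_network("192.168.1.0/24")


def is_local_client(address):
    """Return True if the client address falls inside the local network."""
    return ipaddress.ip_address(address) in LOCAL_NETWORK


for client in ["192.168.1.42", "203.0.113.7"]:
    print(client, "allowed" if is_local_client(client) else "refused")
```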

Phase 2

Timeline: 1 year from end of Phase 1 (Summer 2009)

Phase 2 builds on the foundation laid in Phase 1. All incoming music will be integrated into the Digital Library. The Music Committee will have established procedures, roles, and responsibilities for maintaining the library, ensuring its usability, integrity, and longevity.

In particular:

  • While Phase 1 does not require all genre directors to participate, all will be working with the library as Phase 2 concludes.
  • Procedures will be documented by the Music Committee, and responsibilities explicitly established. Any new positions (for example, a technical contact, a library maintainer, etc.) will be filled.
  • As all incoming music will be digitized, the library will be available to all airshifters. The exact methods used to guarantee availability and security will be finalized at the end of Phase 1.
  • Plans for enhancing the library will be developed for implementation in Phase 3. Budgets in terms of financing and volunteer hours will be developed, along with an expected timeline and milestones. Examples of future projects include:
    • establishing a DAAP server for iTunes integration
    • expanding the database to handle nonstandard metadata
    • integrating the current WEFT archives
    • computerized playlist tracking and reporting
    • developing a training plan for interested airshifters