- Recommended digital recording
settings and file formats
- Preservation agreements
- Guidelines for naming files
- Copying files and labeling media for preservation and access
- Records management
- Preservation guides
Recommended digital recording settings and file formats
The most important factor determining
the quality of a recording, whether analog or digital,
is the quality of the recorder's pre-amp, the circuit
that amplifies the weak electrical signal from the
microphone before it is recorded.
There is ongoing debate over the merits of various
digital file formats, such as .mp3, .wav and .bwf,
as well as various recording parameters that determine
file size and sound quality. In practical terms, in
order to achieve the highest possible sound quality
and the maximum range of potential uses (including
the ability to take advantage of as-yet uninvented
technologies), one must invest more time and money
in purchasing and learning to use higher-end equipment
as well as acquiring more disk or server storage capacity.
Given the real-world limitations of most oral history
programs, the most widely agreed-upon compromise is
CD-quality audio recorded in a .wav format at a bit-depth
of 16 bits and a sampling rate of 44.1 kHz. This is
the standard that the music industry uses, and it also
drives the market for most mass produced consumer audio
devices.
Digital archivists, professional sound engineers,
and other digital media experts make a strong case
for adopting a higher standard if the recording is
worth preserving for the future. The larger the bit-depth
and the higher the sampling rate, the more detail can be
captured in the recording and the closer it comes to being
an exact replica of the original sound. Higher settings
also produce larger files, which increases the amount
and cost of storage space required.
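To make the storage trade-off concrete, the following is a minimal Python sketch that estimates the size of one hour of uncompressed PCM audio at several settings. The function name and the list of example settings are illustrative assumptions, not part of any standard. The arithmetic is simply sampling rate times bytes per sample times channels times seconds, and it ignores the small file header.

```python
def pcm_bytes_per_hour(sample_rate_hz, bit_depth, channels):
    """Bytes needed for one hour of uncompressed PCM audio (header excluded)."""
    bytes_per_sample = bit_depth // 8
    return sample_rate_hz * bytes_per_sample * channels * 3600


# Example settings chosen for illustration only.
settings = [
    ("CD quality, mono", 44_100, 16, 1),
    ("CD quality, stereo", 44_100, 16, 2),
    ("24-bit / 48 kHz, stereo", 48_000, 24, 2),
    ("24-bit / 96 kHz, stereo", 96_000, 24, 2),
]

for label, rate, depth, channels in settings:
    # Megabytes are counted here as 1,000,000 bytes.
    megabytes = pcm_bytes_per_hour(rate, depth, channels) / 1_000_000
    print(f"{label}: about {megabytes:.0f} MB per hour")
```

By this arithmetic, moving from 16-bit/44.1 kHz to 24-bit/96 kHz multiplies the storage needed by a little more than three for the same number of channels.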
Another reason for a higher bit-depth and sampling
rate is forgivability: if the recording levels were
too low or there was unwanted background noise and
the recording requires sound editing, a smaller,
lower-resolution file is harder to improve than a
larger, higher-resolution one. See
Sound Devices, "Real World Advantages of 24-bit Recording."
Audio files should ideally be created in an uncompressed
PCM (Pulse Code Modulation) format, either .wav or
.bwf. On the other hand, as the recording capability
and storage capacity of devices such as iPods continue
to improve, including the ability to record at higher
bit-depths and sampling rates, inexpensive recording
in .mp3 format may become more compatible with
good sound quality than is currently the case. Recording
with iPods is a particularly good option for classroom
oral history projects. Refer to the USF Oral History
Program's guide, "Recording with iPods."
PCM formats: WAV, BWF and AIFF
In the United States, the .wav
format is the most widely accepted file format
for digital audio master files. FCLA’s WAVE
Action Plan notes that “the BWF (Broadcast
Wave Format),
developed by the European Broadcasting Union, is
based on the WAVE format. While a WAVE file can use
any one of a number of compression formats, BWF files
can only use either PCM format or MPEG compression.
A BWF file has at least one extra chunk that a WAVE
file does not - the Broadcast Audio Extension chunk.
This chunk contains material that broadcasters would
exchange with each other, like a textual description
of the sound file. The EBU came up with a core metadata
set for radio archives that is based on the Dublin
Core metadata. This metadata can be stored in the <axml> chunk
that was added to the BWF specification in 2003.
A BWF is compliant with the WAVE format and uses
the WAVE file extension (wav) so WAVE players can
play BWF files (but can not parse any added metadata).” Some
audio manufacturers, such as Sound Devices, are merging
the .wav and .bwf formats because “the .bwf
file extension ended up causing more confusion than
it eliminated” (Sound
Devices Sound Notes).
Some oral history programs also use the AIFF format.
The most important factor from a preservation standpoint
is that digital audio be recorded in an uncompressed
pulse-code modulation (PCM) format.
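Because a .wav extension alone does not guarantee uncompressed audio, it can help to look inside the file. The sketch below, written with only the Python standard library, walks the chunks of a RIFF/WAVE file, reads the format tag from the "fmt " chunk, and notes whether a Broadcast Audio Extension ("bext") chunk is present. The function names are hypothetical. The check is deliberately simple: it treats only format tag 1 as PCM, so files that declare PCM through the extensible format tag (0xFFFE) would need an additional check, and RF64 files are not handled.

```python
import struct

WAVE_FORMAT_PCM = 0x0001  # format tag used for uncompressed integer PCM


def inspect_wav(path):
    """Return (format_tag, has_bext) for a RIFF/WAVE file.

    format_tag is the codec identifier stored in the "fmt " chunk
    (None if no such chunk is found); has_bext is True when a
    Broadcast Audio Extension chunk is present.
    """
    format_tag = None
    has_bext = False
    with open(path, "rb") as f:
        header = f.read(12)
        if len(header) < 12 or header[:4] != b"RIFF" or header[8:12] != b"WAVE":
            raise ValueError("not a RIFF/WAVE file")
        while True:
            chunk_header = f.read(8)
            if len(chunk_header) < 8:
                break  # reached the end of the file
            chunk_id, chunk_size = struct.unpack("<4sI", chunk_header)
            body_start = f.tell()
            if chunk_id == b"fmt ":
                # The first two bytes of the fmt chunk hold the format tag.
                (format_tag,) = struct.unpack("<H", f.read(2))
            elif chunk_id == b"bext":
                has_bext = True
            # Chunks are padded to an even number of bytes.
            f.seek(body_start + chunk_size + (chunk_size & 1))
    return format_tag, has_bext


def is_uncompressed_pcm(path):
    """True when the fmt chunk declares plain PCM (format tag 1)."""
    format_tag, _ = inspect_wav(path)
    return format_tag == WAVE_FORMAT_PCM
```

Calling `is_uncompressed_pcm` on a master file before depositing it is one way to catch a compressed file that happens to carry a .wav extension.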
The national standard for digital file formats
can be found at Sustainability
of Digital Formats: Planning for Library of Congress
Collections. Similarly, the Florida Center
for Library Automation (FCLA) has developed preservation
guidelines for the Florida
Digital Archive (FDA), including estimates
of the long-term viability for various digital
formats. Formats that are rated high or medium
confidence level receive full preservation under
the agreements between FCLA and FDA participating
institutions among Florida's eleven state universities.
Formats rated low confidence level receive bit-level
preservation. In general, files in encrypted,
compressed, lossy, or proprietary formats are the
most difficult to preserve. For optimum preservation,
files in these low confidence level formats should
be converted to formats that are unencrypted, uncompressed,
lossless, and open.
For preparing the media types typically used in
oral history (text, audio and video) for professional
digital preservation, FCLA recommends:
- Text: convert documents created with word processing
programs such as Microsoft Word or WordPerfect to
PDF/A-1 (ISO 19005-1) (*.pdf)
- Audio: record in or convert to WAV (PCM)
(*.wav, *.bwf)
- Video: record in or convert to Motion
JPEG (*.avi, *.mov), Motion JPEG 2000 (ISO/IEC
15444-4) (*.mj2), AVI (uncompressed) (*.avi), or QuickTime
Movie (uncompressed) (*.mov)
Text formats
High confidence level:
- Plain text (encoding: US-ASCII, UTF-8, UTF-16
with BOM)
- XML (includes XSD/XSL/XHTML, etc.; with
included or accessible schema and character
encoding explicitly specified)
- PDF/A-1 (ISO 19005-1) (*.pdf)
Medium confidence level:
- Cascading Style Sheets (*.css)
- DTD (*.dtd)
- Plain text (ISO 8859-1 encoding)
- PDF (*.pdf) (embedded fonts)
- Rich Text Format 1.x (*.rtf)
- HTML 4.x (include a DOCTYPE declaration)
- SGML (*.sgml)
- Open Office (*.sxw, *.odt)
- Office Open XML (*.docx)
Low confidence level:
- PDF (*.pdf) (encrypted)
- Microsoft Word (*.doc)
- WordPerfect (*.wpd)
- DVI (*.dvi)
- All other text formats not listed here
Audio formats
High confidence level:
- AIFF (PCM) (*.aif, *.aiff)
- WAV (PCM) (*.wav, *.bwf)
Medium confidence level:
- SUN Audio (uncompressed) (*.au)
- Standard MIDI (*.mid, *.midi)
- Ogg Vorbis (*.ogg)
- Free Lossless Audio Codec (*.flac)
- Advanced Audio Coding (*.mp4, *.m4a, *.aac)
- MP3 (MPEG-1/2, Layer 3) (*.mp3)
Low confidence level:
- AIFC (compressed) (*.aifc)
- NeXT SND (*.snd)
- RealNetworks 'Real Audio' (*.ra, *.rm, *.ram)
- Windows Media Audio (*.wma)
- WAV (compressed) (*.wav)
- All other audio formats not listed here
Video formats
High confidence level:
- Motion JPEG 2000 (ISO/IEC 15444-4) (*.mj2)
- AVI (uncompressed) (*.avi)
- QuickTime Movie (uncompressed) (*.mov)
- Motion JPEG (*.avi, *.mov)
Medium confidence level:
- Ogg Theora (*.ogg)
- MPEG-1, MPEG-2 (*.mpg, *.mpeg)
- MPEG-4 (*.mp4)
Low confidence level:
- AVI (compressed) (*.avi)
- QuickTime Movie (compressed) (*.mov)
- RealNetworks 'Real Video' (*.rv)
- Windows Media Video (*.wmv)
- All other video formats not listed here
Preservation agreements
Ideally, a digital oral history
program should be supported by an agreement
with a library or archives to ensure professional standards
of cataloging, preservation and public access for all
interview materials. The preservation agreement should
include secure storage, verification of file fixity,
media migration as necessary, and format migration
to newer formats as file formats threaten to become
obsolete.
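Verification of file fixity generally means recording a checksum for each file at the time of deposit and recomputing it later to confirm that the file has not changed. The following is a minimal sketch of that idea using Python's standard hashlib module. The choice of SHA-256 and the function names are assumptions for illustration; a repository may specify a different algorithm or tool.

```python
import hashlib


def sha256_of(path, block_size=1024 * 1024):
    """Compute the SHA-256 checksum of a file, reading it in blocks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Reading in blocks keeps memory use low for large audio files.
        for block in iter(lambda: f.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def verify_fixity(path, recorded_checksum):
    """True when the file still matches the checksum recorded earlier."""
    return sha256_of(path) == recorded_checksum.strip().lower()
```

In practice the checksum returned by `sha256_of` would be stored alongside the interview's other records when the master file is created, and `verify_fixity` would be run after every copy or migration.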
Click on the links below for additional
resources on preservation agreements between oral
history programs and archival repositories, including
sample agreements:
Guidelines for naming files
Once digital audio files have been
uploaded from the recorder to a computer hard drive,
it is very important to name all the files generated
by each interview using a consistent system that will
create unique ID numbers and ensure that basic information
about the provenance of the interview remains attached
to the file. According to the Digital Library Federation, "File
naming should follow ISO 9660 conventions: 8-character
filenames, 3-character extensions, using A-Z, a-z,
0-9, underscores and hyphens. The rationale behind
this suggestion is that when moving texts across different
platforms (DOS for instance), some systems will truncate
beyond the eighth character." (Digital Library
Federation, "TEI
Text Encoding in Libraries: Guidelines for Best Encoding
Practices" Version 2.1)
That being said, with only 8 characters in a file
name, it is difficult to include basic information
such as the interviewee name and date. Many oral
history programs do not adhere to 8 characters, but
use the accession number of the interview as the
basis for each unique file number, so that transcripts,
master audio, and derived files (such as sound-edited
or streaming web audio) can be identified as being
from the same interview, and can be matched with
paper documentation on file. An example of a file
name based on the accession number, last name of
interviewee, date of interview (YYYYMMDD), and record
type would be: 00137_smith_20071203_trans.
For audio that requires editing to improve sound quality,
both the original and edited files should be saved
in .wav format and archived for digital preservation.
Since most master files should not require sound
editing unless there are problems during recording,
including edited versions of audio master files should
not significantly add to storage space requirements.
Be sure to specify the inclusion of both edited and
original master files in the preservation agreement
with an archival repository, if applicable.
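The accession-number naming scheme described above can be generated programmatically so that every file from an interview is named the same way. The sketch below is one possible approach in Python; the function name, the five-digit zero padding (inferred from the example 00137), and the handling of accented or punctuated surnames are assumptions, not an established rule.

```python
import re
import unicodedata
from datetime import date


def interview_filename(accession_number, last_name, interview_date,
                       record_type, extension=""):
    """Build a name such as 00137_smith_20071203_trans."""
    # Strip accents, then keep only lowercase letters and digits so the
    # name stays within A-Z, a-z, 0-9, underscores and hyphens.
    ascii_name = (unicodedata.normalize("NFKD", last_name)
                  .encode("ascii", "ignore").decode("ascii"))
    surname = re.sub(r"[^a-z0-9]", "", ascii_name.lower())
    stem = (f"{accession_number:05d}_{surname}_"
            f"{interview_date:%Y%m%d}_{record_type}")
    return f"{stem}.{extension}" if extension else stem


print(interview_filename(137, "Smith", date(2007, 12, 3), "trans"))
# prints 00137_smith_20071203_trans
```

Passing a different record type and an extension, for example `interview_filename(137, "Smith", date(2007, 12, 3), "master", "wav")`, gives a matching name for the master audio file; the label "master" is a hypothetical record type used only for this example.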
Copying files and labeling media for preservation and access
When using CD burning software to
save sound files to CD, there are two disc format options
that can be chosen: CD-DA (audio format) and CD-R (data
format). Note that the same compact discs (media) are
used for both CD-DA and CD-R; only the format of the
recorded file is different.
CD-DA (Compact Disc - Digital Audio) is the official
designation for the audio-only format on CD. Audio
CDs can be read by CD players as well as computers.
An audio CD can store up to 74 minutes and 30 seconds
of sound, so longer interviews will have to be divided
into two parts. If the file is saved in a data format
(CD-R), the files can only be read by a computer,
not a CD player. At recording settings for CD-quality
audio in mono, one hour of recording time creates a
file of approximately 300MB, so a 650MB data CD can
hold about two hours of recording, considerably more
than an audio-formatted CD. Archival quality DVDs can
hold up to 4.7GB of data, but some digital archivists
have expressed concern that DVDs are a less stable
format than CDs, since a different chemical process
is used to bind the layers of the DVD.
External hard drives, available in a range of storage
capacities, are also a good solution, especially for
individuals or programs that do not have preservation
support from an institutional repository. An external drive
connects to a PC via USB or FireWire and usually
comes packaged with back-up software that facilitates
the daily creation of backup copies manually or automatically.
An advantage to having all the files for a project
on one external drive is that it makes transporting
and downloading large amounts of data easier than
having to burn multiple CDs. It is still advisable,
however, that any oral history program that goes
digital should pursue a repository relationship with
an archive that can provide professional preservation
and support, since the ideal method of backing up
digital files is on a dedicated server with digital
preservation protocols in place.
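The capacity figures quoted above for data CDs and DVDs can be checked with a short calculation. The following Python sketch assumes CD-quality mono audio (44.1 kHz, 16-bit), counts a megabyte as 1,000,000 bytes, and ignores file headers and file system overhead, so the results are rough estimates only. The function and constant names are illustrative.

```python
# 44.1 kHz x 2 bytes per sample (16-bit) x 1 channel
BYTES_PER_SECOND_MONO = 44_100 * 2


def minutes_on_media(capacity_bytes, bytes_per_second=BYTES_PER_SECOND_MONO):
    """Approximate minutes of uncompressed audio that fit on a medium."""
    return capacity_bytes / bytes_per_second / 60


data_cd_minutes = minutes_on_media(650 * 1_000_000)
dvd_minutes = minutes_on_media(4.7 * 1_000_000_000)

print(f"650 MB data CD: about {data_cd_minutes:.0f} minutes of mono audio")
print(f"4.7 GB DVD: about {dvd_minutes / 60:.1f} hours of mono audio")
```

Under these assumptions a data CD holds roughly two hours of mono audio, compared with the 74 minutes and 30 seconds available on an audio-formatted CD.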
Following these steps will ensure you have adequate
copies for use and preservation:
- Upload master audio files (WAVE or Broadcast
WAVE format) to a computer hard drive or server.
If recorded on a compact flash card, insert the
card into the appropriate computer port or use
a flash card reader if your computer is older and
doesn't have the right port. You can also connect
the recorder to the computer USB port with an I/O
cable. Name the files as described above.
- Burn two preservation copies (1 audio CD-DA
and 1 data CD-R) of the master file in .wav format
to MAM-A (Mitsui) gold archival discs. For non-preservation
purposes such as listening, transcription and distribution
to interviewees, use CD burning or sound editing
software to derive .mp3s and write to cheaper,
non-archival audio CDs. Keep one preservation audio
CD and one use audio CD onsite. If you have an
agreement with an archival repository, send a preservation
data CD for cataloging and preservation. If not,
make several back-up copies.
- After the master audio file has been
transferred and safely backed up, erase and reuse
the flash card or other storage media. Label
write-once (non-reusable) recording media on
the inside clear ring of the disc only, with
felt tip pens approved for CD/DVD labeling. Do
not use adhesive labels. For non-preservation
copies, which are used by or sent out to the
public, CDs or DVDs can be silkscreened with
program information (logo, contact info, copyright
info, etc.) and remaining identifying information
filled in with CD/DVD labeling pens. The
National Institute of Standards and Technology
(NIST) has an online publication called "Care
and Handling of CDs and DVDs: A Guide for Librarians
and Archivists" (see pages 21 through 26 for guidelines
on labeling). Store the media in a
stable environment away from excessive heat, light, and humidity. For audio
tapes, be sure to limit access to the original and only use copies for transcribing
and listening.
Records management
Each interview should be tracked
using paper documents and a database. The interviewer
should submit the following paper documents to be kept
on file for each interview:
- Interview cover sheet and checklist
- Interviewee life history form
- Proper name form
- Field notes
- Release form
- Transcript or recording index (if interview
will not be transcribed)
- Any additional documentation provided by the
interviewee, such as resume or CV, photographs,
or memorabilia.
Sample forms are available at:
An index card or database record can then be created
for each interview, using the information on the
interview cover sheet.
Preservation guides
Of course, creating preservation
and use copies on optical media is only the beginning
of effective long-term digital preservation. Cornell
University's Lab of Ornithology offers a very comprehensive
example of state-of-the-art audio digitization and
preservation methods that can be applied to oral history
as well as field recordings of birdcalls.
The Field
Audio Collection Evaluation Tool (FACET) is a very useful, point-based,
open-source software tool that assesses and ranks audio field collections
based on preservation condition, including the level
of deterioration they exhibit and the degree of risk
they carry. Additional resources on digital preservation
best practices are listed below.