WO1998047252A2 - Personal audio message processor and method - Google Patents

Personal audio message processor and method

Info

Publication number
WO1998047252A2
Authority
WO
WIPO (PCT)
Prior art keywords
digital
audio
analog
communication
voice
Prior art date
Application number
PCT/US1998/007228
Other languages
French (fr)
Other versions
WO1998047252A3 (en)
Inventor
Geoffrey Stern
Echo Technologies Ltd. Ent
Gil Wexler
Original Assignee
Geoffrey Stern
Gil Wexler
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Geoffrey Stern, Gil Wexler
Priority to EP98914674A (patent EP1060616A2)
Priority to JP10544101A (patent JP2001503236A)
Priority to IL13230698A (patent IL132306A0)
Priority to AU68976/98A (patent AU6897698A)
Priority to CA002286043A (patent CA2286043A1)
Publication of WO1998047252A2
Publication of WO1998047252A3

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M11/00 Telephonic communication systems specially adapted for combination with other electrical systems
    • H04M11/06 Simultaneous speech and data transmission, e.g. telegraphic transmission over the same conductors
    • H04M11/066 Telephone sets adapted for data transmission

Definitions

  • The present invention relates generally to dictation and audio communication devices and, more particularly, concerns a method and portable apparatus for audio communication, including the recording and editing of voice mail and audio content and its transmission and reception over a private or public network, such as the Internet, using common electrical communication media or data links.
  • All electronic message systems, with the exception of voice mail, have intermediate devices or storage media whereby data may be transferred, preferably at a high transmission rate, over a standard communication link and stored in a storage medium or on an unattended device for later off-line access, review and editing by the intended user.
  • an image is scanned by the transmitter and then transmitted and ultimately printed at a remote site for off-line utilization by the intended receiver.
  • data is generated on a computer and then transmitted and stored either directly on the intended user's unattended computer or on a central host computer linked to a network of computers for subsequent retrieval by the intended user.
  • the most common networks are Local Area Networks (LAN), Wide Area Networks (WAN), and public networks, such as the Internet, or private networks.
  • LAN Local Area Networks
  • WAN Wide Area Networks
  • a facsimile may be transmitted to a computer or handheld, paperless fax machine for off-line and independent review by the recipient, such as Reflection Technology, Inc.'s FaxView personal fax reader.
  • HTML Hyper Text Markup Language
  • Web site graphics utilities such as Web-On Call Voice Browser by Netphonic Communications, Inc. have been introduced which permit users to access the Internet in response to voice prompts.
  • subscription services have been introduced which permit voice mail to be sent to an e-mail address and also permit audio content offered on a Web site to be updated by way of a standard phone call to an interactive voice response (IVR) system (e.g. "Amail" and "Dialweb" by Telet Communications).
  • IVR interactive voice response
  • voice processor system manufacturers have established a work group representing more than 60% of the world's voice mail system market to develop an interoperability standard for a Voice Profile for Internet Mail (VPIM).
  • VPIM Voice Profile for Internet Mail
  • TCP/IP Transmission Control Protocol/Internet Protocol
  • SMTP Simple Mail Transfer Protocol
  • MIME Multipurpose Internet Mail Extensions
  • content providers are able to broadcast live audio from a Web site (e.g. AudioNet by Cameron Audio Networks).
  • the telephone link for e-mail and facsimile (e.g. PASSaFAX from RADLinx) is further limited to a hook-up to a local point of presence to access the network.
  • Both e-mail and facsimile contain content which may be outputted by the intended user to a printer, which permits the user to take a hard copy of the material with him for review at his convenience, while he is away from his office or traveling.
  • voice messages and voice-text are currently recorded by the sender and retrieved by the intended recipient primarily in real-time and on-line.
  • a user can use his multimedia notebook computer to record and access a stored audio file or streaming voice file.
  • Off-line access to audio is limited to downloading audio files onto a multimedia computer and having the sound card equipped computer play the audio.
  • a multimedia computer, with its screen, keyboard and multipurpose processing capability, is hardly the size of a traditional dictation device or voice recorder. This dependence on a telephone handset or multimedia computer to create and access audio is analogous to requiring a recipient of a facsimile to view, edit and prepare a facsimile only while in close proximity to a facsimile machine or fax-enabled computer.
  • TAD Telephone Answering Device
  • SOHO home-office
  • CTI Network based messages and content
  • Faxes can be accessed as data on a computer screen, data can be accessed as a fax or text-to-speech audio-text and, as automatic speech transcription utilities become more capable, audio will be accessed as printed text in e-mail or fax.
  • audio does not have an input/output device of choice other than a telephone handset or screen/keyboard based multimedia computer, its desirability as a medium of choice will likewise be severely limited.
  • info-text should be the preferred medium for timely data on meetings, speeches and radio broadcasts.
  • voice mail should be the preferred mode of communications when traveling, when communicating across time zones and when accessing timely information which originated in the spoken word (e.g. minutes of a meeting or lecture).
  • Voice text i.e. data or text which is spoken by a computer or pre-recorded by a human
  • In its present state, voice mail is limited to short messages between individuals wishing to communicate in a more substantive fashion at another time (telephone tag).
  • Voice "mail" becomes limited to voice "messaging" because of the cost and inconvenience to both the sender and receiver of listening to lengthy, content-rich "mail" over the phone or at a multimedia computer.
  • the Lamer et al. system permits the user to record and play back, transmit (upload) and receive (download) voice messages from a central message facility and over a communication link and onto a portable device; however, the Lamer et al. system requires that a direct telephonic link be established between the portable device and one or more remote central message facilities.
  • the Lamer et al. and Goldberg et al. systems enable the portable device to individually access a traditional, closed, expensive, proprietary voice processing system through a direct communication link.
  • the Lamer et al. system does not provide for a method by which the user may browse available audio content nor for a method to select audio files from a menu for subsequent retrieval by the portable computer device.
  • the Lamer et al. system does not provide for a utility whereby the user may remotely access a central server linked to a network of servers to download control code, search a personal user group or public database for an address other than by way of initiating a dedicated "training" mode by either coupling the portable computer device directly to a computer or by way of detecting and recording DTMF tones generated locally by a standard touch-tone telephone device.
  • Internet Internet access device which enables the user to record, edit and play audio files which may be transmitted and/or received over a public or private network.
  • M-TAD Telephone Answering Device
  • Providing such a portable access device and method would permit TAD owners to encourage inbound callers to leave more robust and data-rich audio messages on their TAD as well as permit TAD owners to subscribe to audio content which could be regularly delivered to their TAD in compressed digital form and downloaded onto the present invention for playback and review at a convenient time and place. This would also permit TAD owners, while away from their home or office, to have their portable dictation and voice message recording/reviewing device establish a telephone link with their TAD and economically and automatically retrieve all stored messages and update all outgoing messages (e.g. general and caller-specific greetings), with all stored messages and outbound greetings being transmitted in digitized and compressed format.
  • the invention provides a low cost, portable recording and playback dictation and voice message recording/reviewing device which permits the user to record, edit, play and review voice messages including audio-text, text-to-speech and other audio material which may be received from and subsequently transmitted to a remote host computer located on a public or private network over a communication link such as the public switched telephone system.
  • a preferred device contains its own rechargeable power source, integrated circuitry and control buttons to permit the localized recording, editing, storage, playback and transcription of audio signals through a built-in speaker, microphone or plug-in headset, foot pedal and removable memory card.
  • the device also contains a standard RJ-11 telephone jack, modem chip set (or software), or a removable PCMCIA connector to which a standard or wireless modem card could be connected; and a DTMF tone decoder to permit the transmission and control of audio signals to and from a host computer connected to a public or private network.
  • the device contains circuitry which permits it to transmit and receive audio signals at a rate substantially faster than originally recorded.
  • a preferred device also contains a processor which includes the necessary terminal emulation to permit a network user to access a network directly from a local point of access, such as an Internet service provider's (ISP) point of access and shell account, using a standard protocol such as SMTP.
  • ISP Internet service provider's
  • a preferred device also contains a standard or touchscreen display and software which permits the user to display a similar graphical editor for composing and reading e-mail messages as is displayed on his computer screen when accessing his e-mail, so that the user can scroll through his e-mail messages, selecting those audio files he wishes to download and selecting text messages he wishes to have converted, either by the network server or at the device, into an audio format (text-to-speech).
  • a preferred device also contains: a cradle into which the device may be placed, the cradle having ports which enable it to be connected to a power source to recharge the device's batteries; a phone jack to enable it to establish a communication link; and a serial or parallel port on a computer for downloading and uploading files directly to the computer or for receiving "redirected" files.
  • a preferred device also contains a language user interface capable of recognizing and responding to speech with speech.
  • Such an interface includes speaker independent functions but also permits speaker adaptation which allows the personal device to adjust to the peculiarities of the user's voice or pronunciations and thus improve accuracy.
  • This speaker adaptation is achieved through a protocol which allows the system to adapt to the user's voice through the repetition of a set of sentences prior to first use of the device (see Lernout & Hauspie Speech Products' [LHSP] asr1000 product line).
  • the language interface includes a vocabulary builder which permits the user to extend the vocabulary, including special terms and proper nouns, of the speech recognition application (see LHSP Lextool™), a user template which enables the user to create words which the device will associate with user-defined commands e.g.
  • a preferred device also contains public-key encryption technology designed to ensure reliable and secure transmission of sensitive information by encrypting and decrypting the message data and by authenticating the sender's identity by using a secure digital or voice signature.
  • a preferred device also contains a text-to-speech utility which permits the user to download data not already converted to speech by a network server and to do so at the device.
  • a preferred device also contains a bar code reader which permits the user to scan a printed bar code associated with printed matter such as a news article, a map, a menu of available audio files or in a travel guide which would give the device all the information it needs including network server address, file location and file ID so that the audio file associated with the printed matter could be automatically retrieved from a network such as the Internet.
  • a preferred device also contains a bar code reader which permits the user to scan a printed bar code associated with printed matter such as a news article, a map, a menu of available audio files or in a travel guide which would give the device all the information it needs to play a file from a previously retrieved group of audio files (such as described in Goldberg et al.).
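The bar code lookup described above can be illustrated with a short sketch. The description only says that the code carries a network server address, a file location and a file ID; the pipe-separated payload layout, the `AudioFileRef` type and the `parse_barcode` function below are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AudioFileRef:
    server: str     # network server address
    location: str   # file location (path) on that server
    file_id: str    # identifier of the audio file


def parse_barcode(payload: str) -> AudioFileRef:
    """Split a scanned payload of the assumed form 'server|location|file_id'."""
    parts = payload.strip().split("|")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"unrecognized bar code payload: {payload!r}")
    return AudioFileRef(*parts)


# Hypothetical payload printed next to a travel-guide entry.
ref = parse_barcode("audio.example.com|/guides/paris|0042")
```

With the three fields separated, the device would have what it needs either to request the file from the network or to look it up among files it has already retrieved.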
  • a preferred device also contains an infrared interface using a standard such as that of the Infrared Data Association (IrDA) for high-speed local wireless transmission (e.g. 1.2 Mbps and 4 Mbps) of audio files and control codes between the device and a public phone, kiosk or the user's computer.
  • IrDA Infrared Data Association
  • a preferred device also includes a software utility called an off-line browser which programs the device to automatically retrieve, during off-peak hours, audio files from the network to which the user has subscribed, from selected Web sites which have new audio material available, or from e-mail addresses that the user has programmed the off-line browser to check.
  • a preferred device also includes a software utility which enables the user, by way of a graphical screen-based interface or by way of audio prompts, to browse network databases, such as those located on the Internet, for addresses and/or sites from which to receive and to which to send audio files.
  • a preferred device also includes a software utility which creates a graphic interface and memory for the user to access, refresh and/or download his E-mail address book containing the E-mail addresses of individuals and groups for which he may wish to prepare and to which he may wish to send audio files.
  • Such a utility would automatically synchronize the data in the dictation and voice message recording/reviewing device to the data contained in the user's E-mail server account.
  • a preferred device also includes a software utility which creates a graphic interface and memory for the user to organize his/her telephone numbers, E-mail addresses, calendar, reminders and appointments including a clock and alarm function with an option to choose between a simple audible sound alarm or a programmed voice message alarm (e.g. "call home") .
  • a preferred device also includes a software utility which enables the user to download proprietary client server software systems and upgrades and newly introduced standards for low bit rate speech compression made available over a public or private network such as the Internet to ensure that the device may use the latest state-of-the-art audio compression software.
  • a preferred device also includes a software utility which enables the user to download proprietary client server software systems and upgrades and newly introduced standards which enable the device to receive highly compressed and/or streaming audio files containing voice content including, but not limited to application program interfaces (APIs) which enable the device to be used as a portable Internet Phone appliance to conduct a real-time, two-way, full-duplex voice conversation using a local connection to the Internet.
  • APIs application program interfaces
  • a preferred device also includes a software utility which extends the functionality of a Web program run from a Web browser and operates on data such as audio data as it flows into the user's PC, permitting the user to redirect audio files through the communication port directly to the device seated in a cradle and connected to the serial or parallel port.
  • OLE Object Linking and Embedding
  • Web software which when activated by the user by pressing a designated key such as print, redirects audio files directly to a special "printer" driver dedicated for the device.
  • the utility permits users who are browsing the Web on their computers to download audio files directly to their personal audio servers for later access, without having to transfer from their hard disc.
  • a preferred device also includes a software utility which enables the user to select E-mail messages and request that the messages be converted from text to speech by an appropriate text-to-speech conversion application available to the network, and only subsequently digitized and transmitted as a digitized and compressed audio file.
  • the invention also relates to a method and software utility using DSVD (Digital Simultaneous Voice/Data) and/or the VoiceView protocols (Radish Communications Systems, Inc.) which would enable the user, once connected to a communication link, to transfer and receive audio files directly into a dictation and voice message recorder device simultaneously
  • DSVD Digital Simultaneous Voice/Data
  • VoiceView protocols Radish Communications Systems, Inc.
  • the invention also relates to a method and software utility which permit the scalability of digitized audio files in order to conform with network server requirements and/or user preferences. This would enable the server to demand or the user to request a lower compression rate or slower transmission speed in order to have higher fidelity for the audio file requested, and vice versa.
  • a recording device may be left connected to a communication link and programmed to dial into and to connect to a local network access point at off-peak hours when telephone rates are lowest and when excess capacity on incoming lines is available.
  • the recording device is programmed to search the network for audio files to which the user has a subscription, new audio files available from Web sites to which the user has programmed the device to look, and for audio mail sent to the user from selected E-mail addresses.
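The off-peak dialing behaviour described above can be sketched as a simple time-window check. The window boundaries (23:00 to 06:00) and the names `OFF_PEAK_START`, `OFF_PEAK_END` and `is_off_peak` are assumptions made for illustration; the description does not say which hours count as off-peak.

```python
from datetime import time

# Assumed off-peak window; the actual hours would depend on the telephone tariff.
OFF_PEAK_START = time(23, 0)
OFF_PEAK_END = time(6, 0)


def is_off_peak(now: time, start: time = OFF_PEAK_START, end: time = OFF_PEAK_END) -> bool:
    """Return True when `now` falls inside the window, which may wrap past midnight."""
    if start <= end:
        return start <= now < end
    # Window wraps around midnight, e.g. 23:00 to 06:00.
    return now >= start or now < end
```

A device left connected to the line could evaluate this check periodically and dial its local network access point only when it returns True.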
  • an interface port such as a standard RJ-11 telephone jack
  • the recording device may be connected between a telephone set, computer, cellular phone or personal digital assistant and a communication link to enable the user to select and retrieve voice files while using any of the above devices.
  • circuitry is provided for the digital conversion and compression of the analog voice signals recorded in the memory of a dictation and voice message recording/reviewing device to permit high density storage and high speed transmission of digitized voice.
  • circuitry is provided for the analog conversion and natural sounding playback of previously stored or received digitized voice.
  • a public terminal e.g. in a manner similar to an automated teller machine and located at places such as airports and tourist sites where a user could connect his recording/reviewing device and select voice messages and audio-text to be retrieved and transmitted directly by the recording/reviewing device.
  • Figure 1 is a schematic block diagram of a preferred personal audio message processor embodying the present invention.
  • Figures 2-7 are flow charts illustrating how certain processing is performed in the apparatus of Fig. 1.
  • FIG. 1 is a schematic block diagram of a presently preferred Personal Voice Server (PVS) system 10 embodying the present invention.
  • PVS system 10 broadly comprises five main parts: a highly integrated DSP/RISC integrated chip 11 (DSP stands for Digital Signal Processor and RISC stands for Reduced Instruction Set Computer); a Telecom/Audio Codec 17; a memory such as SDRAM 12 and/or Flash Memory 13 coupled to the DSP chip; peripherals such as a microphone 26, a speaker 18, a touchscreen/display LCD 19, an infrared I/O 21 and a Barcode reader 15.
  • DSP Digital Signal Processor
  • RISC Reduced Instruction Set Computer
  • Operating system software is also provided to manage the DSP to handle modem routines such as V.32bis, V.34, etc., voice recognition, echo cancellation and speech synthesis; software also controls the system via the RISC part of chip 11.
  • Although the embodying device 10 is referred to as a voice server, it should be clear that it is equally useful for other types of audio, including music.
  • the DSP chip is preferably a Philips Semiconductor PR31100 chip, which contains a MIPS R3000 RISC CPU core with 4 Kbytes of instruction cache and 1 Kbyte of data cache, plus various integrated functions for interfacing to numerous system components and external I/O modules.
  • the chip also has a hardware multiply/accumulate unit to perform DSP functions, such as a software fax/modem which eliminates the need for an external modem chip set.
  • the chip also has a UART (Universal Asynchronous Receive Transmit) interface 22 (shown separately), which permits the device to be connected to an external modem or other device (such as a modem-equipped Telephone Answering Device) through a conventional RS232 serial connector 23.
  • UART Universal Asynchronous Receive Transmit
  • the PR31100 also contains multiple DMA (direct memory access) channels and a high-performance, flexible Bus Interface Unit (BIU) for providing an efficient means for transferring data between external system memory, cache memory, the CPU core, and external I/O modules.
  • the PR31100 also contains a System Interface Module (SIM), which provides integrated functions for interfacing to various external I/O modules, such as a liquid crystal display (LCD) 19, an infrared I/O module 21, and the Codec 17.
  • SIM System Interface Module
  • Codec 17 is preferably a Philips UCB 1100 single chip integrated mixed signal audio and telecom codec, which handles most of the analog functions of the system, including the sound and telecommunications codec (analog/digital coding and decoding) functions and touchscreen analog-to-digital conversion, ISDN/high-speed serial, infrared, and wireless peripherals.
  • the high-speed serial interface 14, although shown separately in Fig. 1, is actually part of the UCB1100.
  • the chip has a single-channel audio codec which is designed for direct connection of a microphone and speaker (i.e. components 16 and 28 are actually part of the UCB1100).
  • the built-in telecommunications codec can be connected directly to a conventional RJ-11 jack 20 for connections to a telephone line.
  • the operating system software for the PR31100 is preferably Eden OS version 2.0, commercially available from the Eden Group Limited of Cheshire, England. This operating system is specifically designed to support the PR31100 (also known as DINO) and the UCB1100 (also known as BETTY).
  • a data sheet for the Eden OS is attached, which describes the software support and the drivers provided by the operating system. This data sheet is incorporated in the present description by reference.
  • Memory 12, 13 is used to store messages and to hold temporary data.
  • the flash memory is configured according to the amount of permanent programs required, including operating system (O/S) and application software and also to store some of the recorded messages.
  • audio compression provided in the PR31100 will result in a data bandwidth of less than half a Kbyte per second (i.e. 1 Mbyte of memory will provide an hour of audio).
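The storage figure above can be checked with a little arithmetic, and the same numbers show why compressed messages can move over a telephone line much faster than real time. The sketch below assumes 1 Mbyte means 2**20 bytes, uses the nominal line rates of V.32bis (14,400 bit/s) and V.34 (28,800 bit/s), and ignores protocol overhead; the variable and function names are illustrative only.

```python
MEMORY_BYTES = 1024 * 1024   # 1 Mbyte, taken here as 2**20 bytes (assumption)
SECONDS_PER_HOUR = 3600

# Average data rate if one hour of audio fits in 1 Mbyte.
bytes_per_second = MEMORY_BYTES / SECONDS_PER_HOUR   # about 291 bytes per second
bits_per_second = bytes_per_second * 8                # about 2,330 bits per second


def transfer_seconds(audio_seconds: float, line_bps: float) -> float:
    """Time to send `audio_seconds` of compressed audio over a line of `line_bps`."""
    return audio_seconds * bits_per_second / line_bps


for name, bps in (("V.32bis", 14_400), ("V.34", 28_800)):
    minutes = transfer_seconds(SECONDS_PER_HOUR, bps) / 60
    print(f"{name}: one hour of audio in about {minutes:.1f} minutes")
```

Under these assumptions the rate works out to well under half a Kbyte per second, and an hour of audio transfers in roughly ten minutes at the V.32bis rate or five minutes at the V.34 rate.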
  • a microphone 26 and speaker 18 are selected based on quality and size.
  • Flow diagrams are presented in Figs. 2 - 7 to describe the operation of retrieving messages over the Internet and transmitting them to and from the PVS as well as the various operational options for dialing, receiving data from a given server address in the Internet, storing, screening, retrieving, transmitting and playing messages to/from the PVS.
  • These operations include receiving compressed messages in digital form and audio signals in analog form bi-directionally from speaker/microphone and phone connection.
  • Figures 2a and 2b comprise a flow chart illustrating how the PVS connects to a location on the Internet by Transport Protocol and how the PVS gets all data relating to its Web/email site (e.g. HTML language displaying information) and receives/stores messages (audio, data, etc.) that were sent using either a proprietary or de facto standard (e.g. highly compressed audio at 2.5 kbps).
  • The operation depicted in Figs. 2a and 2b is run concurrently by the real-time kernel of the DSP/RISC (discussed further below with reference to Fig. 3). It enables multiple tasks to be run and executed in parallel. Operation of the main task begins at block 200. Accessing a site and storing or receiving stored messages is run concurrently with other tasks. These tasks can be local to operate the PVS, or other tasks such as the operation of the bar-code reader, voice synthesizer, voice recognition, or to access other Web sites by PPP at the same time.
  • a test is performed to determine whether the desired operation is connection to a network access provider via an out-bound call (at block 210). If not, the modem, in response to a ring, answers the call, completes its handshake procedure, and begins receiving information (block 204). Data bits from the modem are received by DSP chip 11 at block 220. The DSP chip decodes the incoming data at block 230.
  • a test is performed to determine whether the desired operation is to decode an HTML site. If not, control transfers to block 340. Otherwise operation continues at block 250, where the display of the site page begins.
  • a test is performed at block 260 to determine whether the mode of operation is interactive or automatic. In the interactive mode, the user of the PVS has to browse and select the desired operation to be completed. In automatic mode, the keyword(s) to retrieve audio or other messages are searched for and activated automatically to get the compressed data. If the test at block 260 senses the interactive mode, control is transferred to block 110 in Fig. 2b. If not, automatic browsing is done starting at block 270 to search for a highlighted keyword symbol.
  • a test is performed to determine whether the keyword constitutes a request for a previously digitized message and if so, the data compressed by FTP protocol is received by the PVS at block 290. If the test at block 280 results in a "no", control transfers to the block 310.
  • a test is performed to determine whether no more messages exist, and, if so, control returns to block 100. Otherwise, a test occurs at block 320 to determine if the keyword constitutes a request for a place to store local messages at the web server. If so, this data, such as compressed audio messages, is transmitted from the PVS to the web site (block 330). If not, control returns to the start (block 100). The process is continued until there are no other stored messages for the PVS owner at this Web site.
  • a test is performed to determine whether this site is utilizing the FTP protocol. If so, a message is retrieved utilizing FTP (block 360), and it is stored at block 380 and control is transferred to block 120 in Fig. 2b. If it is determined at block 340 that the FTP protocol is not being used, a test is performed at block 350 to determine whether or not a recognized access language is being received. If so, a message is retrieved at block 360 utilizing the recognized access language and is then stored at block 380. Control is then transferred to block 120 in Fig. 2b. If a recognized access language is not found at block 350, the user is notified at block 370 and control returns to block 100.
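The decision sequence of blocks 340 through 380 can be sketched as a small dispatch function. This is only an illustration of the control flow as described: the function name, the `fetchers` mapping and the callback parameters are hypothetical, and the lambdas stand in for real network transfers.

```python
def handle_non_html_site(protocol, fetchers, stored, notify):
    """Follow blocks 340-380 and return the number of the next block to run."""
    if protocol == "ftp":                      # block 340: is the site using FTP?
        stored.append(fetchers["ftp"]())       # blocks 360 and 380: retrieve, then store
        return 120
    fetch = fetchers.get(protocol)             # block 350: recognized access language?
    if fetch is not None:
        stored.append(fetch())                 # blocks 360 and 380
        return 120
    notify(f"unrecognized access language: {protocol}")   # block 370: tell the user
    return 100                                 # back to the start


# Stand-in fetchers; a real device would perform the transfer here.
stored, notices = [], []
fetchers = {"ftp": lambda: b"ftp-audio", "http": lambda: b"http-audio"}
next_block = handle_non_html_site("gopher", fetchers, stored, notices.append)
```

Returning the next block number keeps the sketch close to the flow chart; in the example call the access language is not recognized, so the user is notified and control goes back to block 100.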
  • control is transferred to block 110 in Fig. 2b.
  • the keywords in the web page are selected and, at block 114, HTML interpretation is activated to locate the messages in the pool.
  • messages are then sent and/or received and control is returned to block 100 in Fig. 2a.
  • control is transferred to block 120 in Fig. 2b.
  • Any data which is stored causes the creation of data in a flat database (block 120) , which may be searched to locate the data at a later time.
  • If the message is an audio message, it is decompressed and played at the same time that it is transmitted by the FTP protocol.
  • the test at block 122 determines whether such action is necessary for the current message and, if so, decompression and the audio synthesizer are activated (block 124), the database is updated to reflect that the message is ready to be synthesized, and control is returned to block 100.
  • block 122 transfers control to block 128, where a test is performed to determine if the message is to be sent to the web server and, if not, control is returned to block 100. If the message is to be sent to the web server, it is sent by FTP at block 130, and the user is notified upon completion of the transfer (block 132), after which control returns to block 100.
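The flat database created at block 120 can be pictured as a plain list of records, one per stored message, that is searched by field values. The record fields and the helper names `add_record` and `find_records` below are assumptions for illustration; the description does not specify the database layout.

```python
def add_record(db, message_id, kind, source, ready_to_play=False):
    """Append one record describing a stored message and return it."""
    record = {
        "message_id": message_id,
        "kind": kind,                  # e.g. "audio" or "data"
        "source": source,              # where the message came from
        "ready_to_play": ready_to_play,
    }
    db.append(record)
    return record


def find_records(db, **criteria):
    """Return every record whose fields match all of the given criteria."""
    return [r for r in db if all(r.get(k) == v for k, v in criteria.items())]


flat_db = []
add_record(flat_db, 1, "audio", "ftp.example.com", ready_to_play=True)
add_record(flat_db, 2, "data", "ftp.example.com")
audio_ready = find_records(flat_db, kind="audio", ready_to_play=True)
```

Updating `ready_to_play` on a record would correspond to the database update mentioned for block 124.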
  • the kernel is multitasking, in that it can run multiple programs or tasks concurrently, with each one having its own priority and being capable of initiating other (child) tasks.
  • operation starts at the idle mode at block 480, where the PVS waits for events to occur, and when one occurs it is handled at block 430. Every program interacts with the operating system this way, by having its tasks attended to at block 430.
  • the type of events that arise are either synchronous or asynchronous.
  • processing of the synchronous events is initiated via connector 5. Otherwise, a test is performed at block 450 to detect an asynchronous event, in which case processing of the asynchronous events is initiated via connector 6.
  • After processing is initiated, the operating system returns to the idle mode to process other events.
  • Another special event to occur is error handling at block 460.
  • a test is performed at block 460 to detect a failure event and if there is none, the program returns to the idle mode.
  • an error event is detected at block 460 and a run time handler is issued (block 470) and handles the event. Control then returns to the idle mode.
  • the synchronous and asynchronous events identified in Fig. 3 are only exemplary and it is contemplated that there may be others of each type.
  • Figure 4 is a block diagram illustrating the routine performed by the controller of DSP/RISC Chip 11 when an analog audio message is to be recorded.
  • a test is performed to determine whether the incoming messages are from the built-in microphone. If not, control is transferred to the routine of Fig. 5. If so, the audio message is digitized and compressed (block 720) and placed in the working pool of data (block 730) .
  • a test is performed to determine whether memory was filled before an entire message was stored. If not, the routine is terminated, and control returns to the idle mode. If so, recording is disabled (block 750), and the operator is notified, as by a warning light, that the memory is full (block 760). Control reverts to the idle mode.
  • Figure 5 is a block diagram illustrating the routine performed to record analog audio from the telephone line.
  • a test is performed to determine whether an audio message being received is from the communications link (telephone line). If not, control is transferred to the routine of Fig. 6. If so, the message is passed through the Telecom/Audio Codec 17 as audio (block 810), and a test is performed at block 820 to determine whether compression is to be performed by the DSP/RISC Chip. If so, the message is stored in local memory (block 830), recording is stopped, and control is returned to the idle mode. If compression is not to be performed by the DSP/RISC Chip, the message is sent to the Telecom/Audio Codec, which compresses it by a standard (ADPCM) algorithm (block 840).
  • the message is then sent back to the DSP/RISC 11 through its UART (block 850), and the DSP/RISC chip controller causes the message to be stored in flash memory 13 (block 860). Control is then returned to the idle mode.
  • Figure 6 is a block diagram of the routine performed by the Audio/Telecom Codec controller to play stored audio through the built-in speaker.
  • the operator selects a message from the pool of messages stored in the device.
  • a test is performed to determine whether the stored message to be read was originally compressed by the audio/telecom codec. If not, control is transferred to block 920. If so, the message is read and decompressed using the audio/telecom codec (block 930), and the decompressed message is applied to the digital-to-analog converter (DAC) in the audio/telecom codec (block 940).
  • the message is then played via the built-in speaker 18 through the D/A converter and amplifier 28 (block 950), and control is returned to the idle mode.
  • a test is performed at block 920 to determine whether the stored message was originally compressed by the audio/telecom codec. If not, the user is notified (block 960) , and control is returned to the idle mode. If so, the message is read by the controller (block 970), and it is then sent to the modem to be decompressed and then returned from the modem to memory 13 through the UART port of the audio/telecom codec 17 (block 980) . Control is transferred to block 940, and playback is handled in the same manner as a message originally compressed by the Audio/telecom codec.
  • FIG. 7 is a schematic illustration of how the PVS, connected to its cradle, may be connected to a PC (whether multimedia or not) or to a specially configured TAD with a built-in modem in order to permit a PC or TAD user (A) to send or receive a voice file from or to the PVS through a modem other than the telecom/audio codec of the PVS.
  • the same configuration would permit a non-multimedia PC user (B) to play audio files by using the PVS's multimedia capabilities to play audio files received over the non-multimedia PC's modem.
  • This configuration would likewise permit the PC user (C) to record audio through the PVS's built-in microphone and transmit it through the PC's modem as files or streaming audio.
  • Such a configuration would also permit the user of a PC (D) to redirect audio files directly to the PVS while using a standard Web browser program.
  • a similar configuration with a modem configured TAD would permit the TAD user to download audio messages to and from the TAD to the PVS.
  • Bi-directional communication from the PC to the PVS is handled by a communication cable (e.g. 9 pin connector) at the PC and the serial RS232 port on the PVS and controlled by the asynchronous event software controlling input/output from the UART communication interface.
  • the software at the PC handles the driver for sending/receiving data to/from the PC to the PVS. For sending data, this would be similar to a PC sending data to a fax or printer, and for receiving data, this is similar to a PC receiving data from a scanner.
  • This driver sets all required parameters for the PVS such as type of operation, length and wait for acknowledgment and "End of Transmission".
  • the PC also handles the software to use the PVS as an attachment (peripheral) for receiving multimedia audio messages so that the speaker on the PVS will operate.
  • the PC also handles the software to manage the microphone input of the PVS, and software to integrate with a standard Web browser (e.g. Netscape Navigator) so that the browser is fully integrated with the software and invokes commands to the PVS accordingly.
  • the software in the PVS is part of the multitasking operating functions to handle Remote activation of Procedural Calls (RPC) controlled under the asynchronous events software of the PVS.
  • PR31100 Processor is a single-chip, low-cost, integrated embedded processor consisting of MIPS R3000 core and system support logic to interface with various types of devices.
  • PR31100 consists of a MIPS R3000 RISC CPU with 4 KBytes of instruction cache memory and 1 KByte of data cache memory.
  • R3000 RISC CPU is also augmented with a multiply/accumulate module to allow integrated DSP functions, such as a software modem for high-performance standard data and fax protocols.
  • PR31100 also contains multiple DMA channels and a high-performance and flexible Bus Interface Unit (BIU) for providing an efficient means for transferring data between external system memory, cache memory, the CPU core, and external I/O modules.
  • Memory types supported include dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), static random access memory (SRAM), Flash memory, read-only memory (ROM), and expansion cards (PCMCIA and/or MagicCard).
  • PR31100 also contains a System Interface Module (SIM) containing integrated functions for interfacing to numerous external I/O modules such as liquid crystal displays (LCDs), the UCB1100 (which handles most of the analog functions of the system, including sound and telecom codecs and touchscreen ADC), ISDN/high-speed serial, infrared, wireless peripherals, Magicbus, etc.
  • 32-bit R3000 RISC static CMOS CPU
  • 4 KByte instruction cache
  • 1 KByte data cache
  • On-chip peripherals with individual power-down: multi-channel DMA controller; bus interface unit; memory controller for ROM, Flash, RAM, DRAM, SDRAM, SRAM, and PCMCIA and/or MagicCard; power management module; video module; real-time clock (32.760 KHz reference); high-speed serial interface; dual UART; SPI bus
  • 3.3V supply voltage
  • 208-pin LQFP (Low profile quad flat pack)
  • Figure 1 shows an External Block Diagram of PR31100.
  • BIU Module: handles data bus, address bus, and control interface between CPU core and rest of PR31100 logic
  • programmable, using breakpoint address, mask, control, and
  • MagicCard or general purpose chip selects available for (future) MagicCard expansion memory
  • PR31100 provides the chip select and card detect signals
  • supports card insertion/removal timeouts
  • MagicCard requires minimal number of unique control status signals per port
  • Clock Module: PR31100 supports system-wide single crystal configuration, besides the 32 KHz RTC XTAL (reduces cost, power, and board space)
  • SPI System Peripheral Interface module
  • power management state machine has 4 states: RUNNING, DOZING, SLEEP, and COMA
  • UART-B port used for general purpose serial control interface
  • UART-A and UART-B DMA support for receive and transmit
  • Serial Interconnect Bus (SIB) Module: PR31100 contains holding and shift registers to support the serial interface to the UCB1100 and/or other optional codec devices
  • each SIB frame consists of 128 clock cycles, further divided into 2 subframes or words of 64 bits each (supports up to 2 devices)
  • Video Module: supports split and non-split displays
  • Figure 2 shows a typical system block diagram consisting of PR31100 and UCB1100 for a total system solution.
  • the UCB1100 is a single chip, integrated mixed signal audio and telecom codec.
  • the single channel audio codec is designed for direct connection of a microphone and speaker.
  • the built-in telecom codec can directly be connected to a DAA and supports high speed modem protocols.
  • the UCB1100 has a serial interface bus (SIB) intended to communicate to the system controller. Both the codec input and output data and the control register data are multiplexed on this SIB.
  • PIC Personal Intelligent Communicators
  • PDA Personal Digital Assistants
  • the telecom codec is intended for direct connection to a DAA (digital access arrangement) and includes a built-in sidetone suppression circuit.
  • SIB 4 wire serial interface data bus
  • EROS Eden Real-time Operating System
  • the operating system consumes resources in the form of ROM and RAM in the product; these resources add to the BOM costs of the product, and any space occupied by the OS must be justified.
  • EROS is designed to be small.
  • the modularity is also a feature which supports the compactness of the operating system; where individual products do not need a feature it can be omitted or replaced by some subset, leaving more room for the visible components that add features and thus perceived value.
  • Open: an open OS will be more likely to attract 3rd party developers looking to design software products for sale, allowing more value, in the form of available features, to be added to products based on the OS.
  • EROS has a published API and a PC-based SDK which supports the development of applications in a readily available development platform.
  • With EROS 99% written in ANSI C, porting to a new processor and/or tailoring to a specific product design is sufficiently simple and predictable for this to be completely acceptable within a product development lifecycle.
  • EROS offers the same application interface on each platform, allowing applications to run on any EROS platform.
  • EROS application development is carried out on a PC SDK incorporating a subset of the target OS.
  • Eden will adopt the GNU toolset for the development of EROS itself and support this toolset for all targets.
  • The components of EROS are:
  • ARK: This is the core of EROS. Based on the ITRON 3 specification and extended, it supports pre-emptive, prioritised multitasking, message queues, semaphores, rendezvous ports, event flags and interrupt handling.
  • VMM Virtual Memory Management
  • EVE Eden's Visual Environment
  • ADAM Advanced Database Access Module
  • the EROS clipboard supports copy, cut and paste and drag and drop. It does this by allowing applications to set-up self-describing data items which can then be passed between applications which have no knowledge of each other.
  • EROS' file system is built as a number of layers, allowing multiple filing systems to be supported (typically a DOS-compatible filing system on PC-cards and a Flash-oriented one for built-in non-volatile storage) without the applications being aware of such details.
  • PC card services: EROS supports SRAM, Flash and ATA drives as storage and data exchange devices.
  • the PC card services offer a key set of facilities allowing support for specific card types to be developed as necessary.
  • EROS' Device Manager supports the dynamic addition of device drivers and allows handler tasks to establish a connection to whichever is the most appropriate driver.
  • TCP/IP: EROS supports TCP/IP, SLIP and PPP. A number of higher-level protocols are supported as standard within the OS including UDP, FTP, SMTP,
  • Linking and Loading: Embedded systems are typically provided as a single ROM containing the operating system and all the applications. The addition of new applications and the correction of those supplied in ROM is difficult. Flash memory is used, but the mechanisms for upgrade and addition are usually clumsy.
  • EROS makes use of a Dynamic Linker Loader (ELF) to overcome much of this difficulty.
  • EROS itself and built-in applications are installed in ROM but their external linkage symbols are loaded into RAM during start-up. Patches can be installed so that later in the start-up sequence some of these symbols are changed to point to new code, thus avoiding the obsolete areas of code in ROM.
  • applications which are loaded dynamically are linked to this symbol table and so use the correct built-in and patched code.
  • the OS structure supports OEMs and application developers in providing a framework within which applications can be constructed which are easily ported from language to language and from country to country, with little or ideally no change to software.
  • Embedded applications are often battery powered and hence power use is critical. While the degree of support offered by particular processors and products will vary, EROS supports an API which allows applications to be constructed in a power-sensitive manner and supports the specific attributes of particular platforms in an appropriate manner.
  • Application Interface: Any application program interacts with EROS through the Application Program Interface (API). At the programming level these appear as function calls. These functions are primarily in the form of 'helpers' which execute as part of the application task and exchange information with one or more EROS tasks before returning to the application code. Responses and other input from EROS are provided by messages sent to the task's input queue or, for so-called 'blocking' calls, by the helper function using a 'rendezvous' for the exchange. Application tasks are usually structured as a single message handling loop which takes messages from a message queue.
  • EROS includes a set of tools to enable applications to be developed for EROS platforms. Such applications will usually be platform (processor) and product independent, subject to appropriate devices being available to handle the interfaces.
  • the toolset comprises:
  • EROS for the target platform is supplied in the form of shared libraries making up the 'helper' functions, object code for the EROS tasks and an initial startup sequence to be modified by or on behalf of the OEM.
  • EROS supplies a skeleton start-up sequence (above) for each target platform; extending this is a product-specific task.
  • Non-standard devices: EROS has a device handling architecture which supports the addition of new device handlers.
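
The application structure described above for EROS, a single message-handling loop that takes messages from the task's input queue, can be illustrated with a minimal sketch. The EROS API itself is not given in this text, so every name here (`run_task`, the message types, the handler table) is hypothetical, and Python's standard `queue` module stands in for the EROS message queue purely for illustration.

```python
import queue

def run_task(inbox, handlers):
    """Single message-handling loop: take messages from the task's input
    queue and dispatch on message type until a 'quit' message arrives."""
    handled = []
    while True:
        msg_type, payload = inbox.get()   # blocks until a message is available
        if msg_type == "quit":
            break
        handler = handlers.get(msg_type)  # unknown message types are ignored
        if handler is not None:
            handled.append(handler(payload))
    return handled

# Hypothetical messages, such as an OS task might post to an application.
inbox = queue.Queue()
inbox.put(("key", "PLAY"))
inbox.put(("timer", 5))
inbox.put(("quit", None))

results = run_task(inbox, {
    "key": lambda k: f"key:{k}",
    "timer": lambda t: f"tick:{t}",
})
print(results)  # ['key:PLAY', 'tick:5']
```

Keeping all input on one queue means the task never has to poll several sources; a blocking 'rendezvous' call, as described above, would be a separate mechanism layered on top.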

Abstract

A portable device is disclosed which permits the user to record, edit, play and review voice messages and other audio material which may be received from, and subsequently transmitted to, a remote voice processing or interactive voice (IVR) host computer over a communication link. A preferred device contains its own power source, integrated circuitry (11) and control buttons to permit the localized recording, editing, storage and playback of audio signals through a built-in speaker (18), microphone (26) and removable memory card. The device also contains a standard RJ-11 telephone jack (20), modem chip and control of audio signals to and from a host computer. The device contains circuitry which permits it to transmit and receive audio signals at a rate substantially faster than originally recorded.

Description

PERSONAL AUDIO MESSAGE PROCESSOR AND METHOD
Field of the Invention
The present invention relates generally to dictation and audio communication devices and, more particularly, concerns a method and portable apparatus for audio communication, including the recording and editing of voice mail and audio content and its transmission and reception over a private or public network, such as the Internet, using common electrical communication media or data links.
Background of the Invention
All electronic message systems, with the exception of voice-mail, have intermediate devices or storage media whereby data may be transferred, preferably at a high transmission rate, over a standard communication link and stored in a storage medium or onto an unattended device for later off-line access, review and editing by the intended user.
In the case of a facsimile transmission, an image is scanned by the transmitter and then transmitted and ultimately printed at a remote site for off-line utilization by the intended receiver. In the case of electronic mail, data is generated on a computer and then transmitted and stored either directly on the intended user's unattended computer or on a central host computer linked to a network of computers for subsequent retrieval by the intended user. The most common networks are Local Area Networks (LAN), Wide Area Networks (WAN), and public networks, such as the Internet, or private networks. When the intended user accesses his computer, either the E-mail is already resident, or he finds a message displayed in a graphic editor indicating that he has mail and how he can retrieve it. Once the E-mail is retrieved, it likewise may be read, reviewed and manipulated by the intended user off-line on the user's computer. Alternatively, it may be output to a printer, providing the user a hard copy for review at his convenience.
When a facsimile machine is unavailable, a facsimile may be transmitted to a computer or handheld, paperless fax machine, such as Reflection Technology, Inc.'s FaxView personal fax reader, for off-line and independent review by the recipient.
Utilities exist for both facsimile and E-mail messages, whereby messages may be selected from a host by an authorized user for subsequent transmission to the user's E-mail address or unattended facsimile machine. See, for example, Duehren et al., U.S. Patent No. 4,918,722.
Recently, with the widespread and growing usage of the Internet and, more particularly, with the growing popularity of WEB sites offering published material in the form of HTML (Hyper Text Markup Language) documents, utilities have been created which permit such files to be selected for subsequent off-line access and independent review by fax. See, for example, FactsLine for the Web, by Ibex Technologies, Inc. Such a utility makes the large volume of information and graphics offered over the Internet, available to users who either do not have access to a computer connected to the Internet, or wish to limit the amount of time spent on-line.
A large percentage of potential users do not have access to the Internet or, even if they do, may be traveling, may not have access to their computers, or may not wish to spend time booting their computer and waiting for Web site graphics. Utilities such as Web-On Call Voice Browser by Netphonic Communications, Inc. have been introduced which permit users to access the Internet in response to voice prompts, to navigate to a document or E-mail of interest, to identify a document by number, and to have a selected document read in real time over the phone using a text-synthesizing voice and faxed back or sent as an e-mail attachment.
Similarly, the widespread use of the Internet and heavy traffic to particularly popular Web sites or during particular peak usage times has created a demand for utilities called off-line browsers, which permit Internet users to "subscribe" to particular Web sites from which their computer then automatically retrieves material during off-peak hours, categorizes and organizes new and updated information, and permits the user to review it off-line using his browser of choice (e.g. FreeLoader by FreeLoader, Inc.).
Similarly, subscription services have been introduced which permit voice mail to be sent to an e-mail address and also permit audio content offered on a Web site to be updated by way of a standard phone call to an interactive voice response (IVR) system (e.g. "Amail" and "Dialweb" by Telet Communications). Recently, voice processor system manufacturers have established a work group representing more than 60% of the world's voice mail system market to develop an interoperability standard for a Voice Profile for Internet Mail (VPIM). TCP/IP (Transmission Control Protocol/Internet Protocol) has been selected as the vehicle of connectivity, because of its globally accessible points of contact, primarily on the Internet, and because of its use of commonly recognized transmission protocols, specifically Simple Mail Transfer Protocol (SMTP) and Multipurpose Internet Mail Extensions (MIME), as the core of VPIM (see the April 29, 1996 issue of Business Wire). Once implemented, interoperable standards such as VPIM will permit voice mail users to send and receive their voice messages over the Internet or an Intranet as easily as they can now do so over the telephone. In addition to voice messaging and audio e-mail over the Internet, the recent introduction of proprietary client server software systems permits users with conventional multimedia personal computers and voice grade telephone lines to browse, select, and play back audio or audio-based multimedia content in real-time streams (RE) or download on-demand (REM). An interested user need only download software from the content provider's Web site to access such audio content (e.g. Progressive Networks' RealAudio Player and Server). Systems such as this represent a real breakthrough, since in the past, delivery of audio by conventional on-line methods downloaded it at such low rates that acquiring the information took five times as long as the actual program. This required the listener to wait 25 minutes before listening to 5 minutes of audio.
As a result of the availability of streaming audio over the Internet, a number of companies have introduced Internet telephone products which permit users having multimedia computers programmed with proprietary software to talk in real time over the Internet (see VocalTec). Such a system is useful over long distances when users can access a local Internet access point or point of presence, turning a long distance call into a local call.
Similarly, as a result of streaming audio over the Internet, content providers are able to broadcast live audio from a Web site (e.g. AudioNet by Cameron Audio Networks).
Recently, a standards-based implementation for communication over the Internet has been introduced, supported by Intel and Microsoft, which makes use of the DSP Group's TrueSpeech G.723 compression technology. This uses an advanced algorithm that results in excellent voice quality, despite a high compression ratio, and operates at 6.3 kilobits per second (kbps) and 5.3 kbps with compression ratios of 20:1 and 24:1, respectively. It also includes silence compression, which can bring the effective rate down to less than 3.7 kbps. At a 28.8 kbps modem speed, this would permit the transmission of audio at a ratio of 1:7.78, or 10 minutes of audio in 1.3 minutes.
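
The figures quoted above can be checked with a few lines of arithmetic. The sketch below assumes the uncompressed reference is 16-bit linear PCM sampled at 8 kHz (128 kbps); the text does not state this, but it is consistent with the quoted 20:1 and 24:1 ratios. The function names are illustrative only.

```python
# Assumption (not stated in the text): uncompressed speech is 16-bit
# linear PCM sampled at 8 kHz, i.e. 16 * 8 = 128 kbps.
PCM_KBPS = 16 * 8

def compression_ratio(codec_kbps, source_kbps=PCM_KBPS):
    """Ratio of the uncompressed bit rate to the compressed bit rate."""
    return source_kbps / codec_kbps

def transfer_minutes(audio_minutes, codec_kbps, link_kbps):
    """Minutes needed to send audio encoded at codec_kbps over a link_kbps line."""
    return audio_minutes * codec_kbps / link_kbps

print(round(compression_ratio(6.3), 1))            # 20.3
print(round(compression_ratio(5.3), 1))            # 24.2
print(round(28.8 / 3.7, 2))                        # 7.78
print(round(transfer_minutes(10, 3.7, 28.8), 2))   # 1.28
```

Under these assumptions, ten minutes of speech at an effective 3.7 kbps crosses a 28.8 kbps link in roughly 1.3 minutes, which matches the 1:7.78 ratio given in the text.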
With Texas Instruments' C80 DSP chip and a V.34 modem running at 28.8 kbps, audio can be transmitted at a ratio of 10:1 (ten minutes of speech in 1 minute of transmission) with telephone grade sound quality. From the above, it is apparent that while the transfer of data, graphics and audio messaging and content over a network has become more widespread and convenient, this growth has also highlighted certain historic shortcomings associated with the transfer and input/output of voice messaging and audio content. As voice messaging and audio content become more available, the deficiency created by the lack of an intermediate device or storage medium for such audio will become more pronounced. For both E-mail and facsimile, use of a telephone link is limited to the transmission of the data and the transmission of control codes for that data. With the growth and widespread usage of network computing, the telephone link for e-mail and facsimile (e.g. PASSaFAX from RADLinx) is further limited to a hook-up to a local point of presence to access the network. Both e-mail and facsimile contain content which may be output by the intended user to a printer, which permits the user to take a hard copy of the material with him for review at his convenience, while he is away from his office or traveling.
In sharp contrast, voice messages and voice-text are currently recorded by the sender and retrieved by the intended recipient primarily in real-time and on-line. At best, a user can use his multimedia notebook computer to record and access a stored audio file or streaming voice file. Off-line access to audio is limited to downloading audio files onto a multimedia computer and having the sound card equipped computer play the audio. However, a multimedia computer, with its screen, keyboard and multipurpose processing capability, is hardly the size of a traditional dictation device or voice recorder. This dependence on a telephone hand set or multimedia computer to create and access audio is analogous to requiring a recipient of a facsimile to view, edit and prepare a facsimile only while in close proximity to a facsimile machine or fax enabled computer. Not being able to prepare, review and access network based voice mail other than in real-time from a telephone hand set or off-line from a multimedia computer severely limits the desirability of integrating voice messaging and audio content into network based messaging. There exist no dedicated and portable devices to store network based voice messaging, and likewise there exists no method or utility to scan and select personal voice messages or public announcements from a host connected to a network for subsequent high speed transmission to a device for subsequent off-line review by the user.
The only dedicated device which permits the user to review his/her voice messages off-line is the Telephone Answering Device (TAD), which is primarily a residential or small-office, home-office (SOHO) appliance which uses digital recording technologies to replace the standard functions of a traditional tape-based answering machine. The TAD, plugged into both an electrical outlet and phone jack, is not portable, so the user must either be within hearing distance of the TAD's speaker or, using a telephone, may call in to retrieve his/her messages on-line and in real-time. Traditionally, TADs have offered very limited outbound messaging capabilities, and whatever outbound messaging was offered required that the owner record any outbound message (e.g. a general greeting or caller-specific/mail box-specific message) either from within range of the microphone on the TAD or from a real-time telephone call. That voice messaging, whether network based or TAD based, is limited to on-line and real-time transmission and physically requires access to a telephone set, TAD or multimedia computer is unfortunate, particularly because voice communication inherently does not require any external hardware or instrumentation other than the mouth and ear for a human being to create or access it. Speech is the most natural and self-sufficient form of communication. Speech is hands-free, requiring neither writing instrument, keyboard, screen, dedicated vision nor hand-to-eye coordination on the part of the user to input or retrieve. That voice mail is nonetheless so widely used is more a function of speech's unique characteristics than a vote of approval on the adequacy of the current technology.
Similarly, that so many innovative utilities have been introduced which make audio and voice available over public and private networks is a commentary on the compelling nature of audio and voice for content, messaging and issuing commands and only underscores the need to make audio and voice more easily available. Until such time that voice messaging and audio content are made more accessible, many of the network based audio utilities mentioned above will remain novelties for technophiles .
Much has been said about Computer Telephone Integration (CTI) and the Universal Mail Box, where network based messages and content may originate in any medium and by any input device of choice and, likewise, may be retrieved in any medium or by any output device of choice. Faxes can be accessed as data on a computer screen, data can be accessed as a fax or text-to-speech audio-text and, as automatic speech transcription utilities become more capable, audio will be accessed as printed text in email or fax. However, as long as audio does not have an input/output device of choice other than a telephone handset or screen/keyboard based multimedia computer, its desirability as a medium of choice will likewise be severely limited.
Since speech is a direct record of the user's voice, the urgency, meaning and emotional content is never lost. Similarly, since so much data is first generated in voice and is only later transcribed to text or data, info-text should be the preferred medium for timely data on meetings, speeches and radio broadcasts. Ideally, voice mail should be the preferred mode of communications when traveling, when communicating through time-zones and when accessing timely information which originated in the spoken word (e.g. minutes of a meeting or lecture). Voice text (i.e. data or text which is spoken by a computer or pre-recorded by a human) should be the preferred format for messaging information to be accessed where use of motor skills and vision are not convenient or are impaired such as when driving, operating equipment or engaged in a leisure activity. The current use of a telephone to access voice messages directly has significantly limited the potential utilization of voice messaging. Real-time transmission of voice messages and info-text makes the recording and retrieval of voice mail, especially from long distances, very costly. The cost and inconvenience involved means that one cannot compose and review voice mail and info-text in a cost efficient manner and at one's own pace. One is limited to a location and situation in which a telephone is accessible and, in the case of a wireless communication link, to a place where wireless transmission is both possible and desirable. The application of multimedia computers to compose and review voice mail has had little effect on making voice messaging more convenient since the use of keyboards, pointing devices and screens is hardly hands-free, nor is the size and expense of a multimedia computer conducive to widespread use and transportability. In its present state, voice mail is limited to short messages between individuals wishing to communicate in a more substantive fashion at another time (telephone tag).
Voice "mail" becomes limited to voice "messaging" because of the cost and inconvenience to both the sender and receiver of listening to lengthy, content-rich "mail" over the phone or at a multimedia computer. Furthermore, the cost of transmitting audio signals in real-time, through a direct communication link to the user's voice processor or TAD, and only when the user has access to a telephone (as opposed to unattended recording at off-peak hours), makes more commercial use of info-text (recorded instructions, recorded travelogues, speech transcripts, articles or books on "tape", etc.) and other innovative advertiser/subscriber-supported uses of voice-text unfeasible.
Recently, U.S. Pat. No. 5,444,768, issued to Charles Lamer et al. and assigned to International Business Machines Corp., and U.S. Pat. No. 5,359,698, issued to Shmuel Goldberg et al. and assigned to Espro Engineering, both disclose a portable computer device for audible processing of audio messages stored at one or more remote central message facilities. The Lamer et al. system permits the user to record and play back, transmit (upload) and receive (download) voice messages from a central message facility over a communication link and onto a portable device; however, the Lamer et al. system requires that a direct telephonic link be established between the portable device and one or more remote central message facilities. The Lamer et al. and Goldberg et al. systems enable the portable device to individually access a traditional, closed, expensive, proprietary voice processing system through a direct communication link. The Lamer et al. and Goldberg et al. systems do not provide a commercially feasible solution for accessing voice mail other than by way of a long distance call to a central message facility. The expense associated with such a long distance toll charge would make extended usage of the Lamer et al. system prohibitive. In addition, the Lamer et al. system requires that a user contact one or more remote central message facilities to retrieve and transmit selected audio files. The inconvenience associated with such a polling procedure nullifies the convenience provided by the system.
Similarly, the Lamer et al. system does not provide a method by which the user may browse available audio content, nor a method to select audio files from a menu for subsequent retrieval by the portable computer device. Similarly, the Lamer et al. system does not provide a utility whereby the user may remotely access a central server linked to a network of servers to download control code, or search a personal user group or public database for an address, other than by way of initiating a dedicated "training" mode, either by coupling the portable computer device directly to a computer or by detecting and recording DTMF tones generated locally by a standard touch-tone telephone device. Since a typical user's mail box utilities are handled on his network e-mail server and modified regularly in the course of his sending and receiving e-mail, such a dedicated training session for the portable computer device is impractical. Similarly, since new audio server platforms, utilities and compression schemes are being introduced regularly, there is a need for a dynamic and transparent method for updating both control codes and address books without the need for a dedicated training session.
Broadly, it is an object of the present invention to provide an Internet-ready dictation and voice message recording/reviewing device and method which enable a user to compose and review voice mail off-line, from any location, while engaged in any activity, at a leisurely pace, without incurring telephone toll charges and whether a communication link is presently accessible or not.
It is also an object of the present invention to use a telephone link, preferably to a local network access point, primarily as a communications link for high speed transmission of pre-recorded material and of control codes to facilitate that transmission, thereby limiting the use of a telephone or a multimedia computer and telephone line for voice messaging as a recording or playback device.
It is also an object of the present invention to provide a protocol whereby pre-message handshaking occurs between a dictation and voice message recording/reviewing device and a network server to conform the digitized voice signal to one of the standard voice compression protocols and TCP/IP protocol stacks to facilitate a high speed transmission of voice messages over the network.
It is another object of the present invention to provide a portable and dedicated voice-capable network (Internet) access device which enables the user to record, edit and play audio files which may be transmitted and/or received over a public or private network.
It is also an object of the present invention to provide a portable access device and method which permit the owner of a specially modem-configured Telephone Answering Device (M-TAD) to access and download compressed voice message files directly from the TAD's digital memory onto a portable voice message record/playback device either by way of a direct cable connection to the TAD or by a telephone link.
Providing such a portable access device and method would permit TAD owners to encourage inbound callers to leave more robust and data-rich audio messages on their TAD as well as permit TAD owners to subscribe to audio content which could be regularly delivered to their TAD in compressed digital form and downloaded onto the present invention for playback and review at a convenient time and place. This would also permit TAD owners, while away from their home or office, to have their portable dictation and voice message recording/reviewing device establish a telephone link with their TAD and economically and automatically retrieve all stored messages and update all outgoing messages (e.g. general and caller-specific greetings), with all stored messages and outbound greetings being transmitted in digitized and compressed format.
The invention provides a low cost, portable recording and playback dictation and voice message recording/reviewing device which permits the user to record, edit, play and review voice messages including audio-text, text-to-speech and other audio material which may be received from and subsequently transmitted to a remote host computer located on a public or private network over a communication link such as the public switched telephone system.
A preferred device contains its own rechargeable power source, integrated circuitry and control buttons to permit the localized recording, editing, storage, playback and transcription of audio signals through a built-in speaker, microphone or plug-in headset, foot pedal and removable memory card. The device also contains a standard RJ-11 telephone jack, modem chip set (or software), or a removable PCMCIA connector to which a standard or wireless modem card could be connected; and a DTMF tone decoder to permit the transmission and control of audio signals to and from a host computer connected to a public or private network. The device contains circuitry which permits it to transmit and receive audio signals at a rate substantially faster than originally recorded.
A preferred device also contains a processor which includes the necessary terminal emulation to permit a network user to access a network directly from a local point of access, such as an Internet service provider's (ISP) point of access and shell account, using a standard protocol such as SMTP (Simple Mail Transfer Protocol), POP3 (Post Office Protocol) and MIME (Multipurpose Internet Mail Extensions) in the TCP/IP suite to review, select and retrieve audio files that have been sent to the user's e-mail address (or similarly, data/text files which can be translated into voice), and to download and transmit such files. A preferred device also contains a standard or touchscreen display and software which permits the user to display a graphical editor for composing and reading e-mail messages similar to the one displayed on his computer screen when accessing his e-mail, so that the user can scroll through his e-mail messages, selecting those audio files he wishes to download and selecting text messages he wishes to have converted, either by the network server or at the device, into an audio format (text-to-speech).
A preferred device also contains: a cradle into which the device may be placed, the cradle having ports which enable it to be connected to a power source to recharge the device's batteries; a phone jack to enable it to establish a communication link; and a serial or parallel port on a computer for downloading and uploading files directly to the computer or for receiving "redirected" files.
A preferred device also contains a language user interface capable of recognizing and responding to speech with speech. Such an interface includes speaker-independent functions but also permits speaker adaptation, which allows the personal device to adjust to the peculiarities of the user's voice or pronunciations and thus improve accuracy. This speaker adaptation is achieved through a protocol which allows the system to adapt to the user's voice through the repetition of a set of sentences prior to first use of the device (see Lernout & Hauspie Speech Products' [LHSP] asr1000 product line). The language interface includes a vocabulary builder which permits the user to extend the vocabulary, including special terms and proper nouns, of the speech recognition application (see LHSP Lextool™), a user template which enables the user to create words which the device will associate with user-defined commands, e.g. "home" could be associated with an e-mail address (LHSP asr 200 product line), alphabet recognition for spelling an e-mail address, as well as background noise tolerance and speech-at-a-distance software which improve the accuracy of the language user interface even in an automobile, airplane or public place and even if the user is not wearing a headset (see LHSP).
A preferred device also contains public-key encryption technology designed to ensure reliable and secure transmission of sensitive information by encrypting and decrypting the message data and by authenticating the sender's identity by using a secure digital or voice signature.
A preferred device also contains a text-to-speech utility which permits the user to download data not already converted to speech by a network server and to convert it at the device.
A preferred device also contains a bar code reader which permits the user to scan a printed bar code associated with printed matter such as a news article, a map, a menu of available audio files or in a travel guide which would give the device all the information it needs including network server address, file location and file ID so that the audio file associated with the printed matter could be automatically retrieved from a network such as the Internet.
A preferred device also contains a bar code reader which permits the user to scan a printed bar code associated with printed matter such as a news article, a map, a menu of available audio files or in a travel guide which would give the device all the information it needs to play a file from a previously retrieved group of audio files (such as described in Goldberg et al.).
A preferred device also contains an infrared interface using a standard such as the Infrared Data Association (IrDA) for high speed local wireless transmission (e.g. 1.2 Mbps and 4 Mbps) of audio files and control codes between the device and a public phone, kiosk or the user's computer.
A preferred device also includes a software utility called an off-line browser which programs the device to automatically retrieve from the network, during off-peak hours, audio files to which the user has subscribed, audio files from selected Web sites which have new audio material available, or audio mail from e-mail addresses that the user has programmed the off-line browser to check.
A preferred device also includes a software utility which enables the user, by way of a graphical screen-based interface or by way of audio prompts, to browse network databases such as those located on the Internet for addresses and/or sites from which to receive and to which to send audio files.
A preferred device also includes a software utility which creates a graphic interface and memory for the user to access, refresh and/or download his E-mail address book containing the E-mail addresses of individuals and groups for which he may wish to prepare and to which he may wish to send audio files. Such a utility would automatically synchronize the data in the dictation and voice message recording/reviewing device to the data contained in the user's E-mail server account.
A preferred device also includes a software utility which creates a graphic interface and memory for the user to organize his/her telephone numbers, E-mail addresses, calendar, reminders and appointments, including a clock and alarm function with an option to choose between a simple audible sound alarm or a programmed voice message alarm (e.g. "call home").
A preferred device also includes a software utility which enables the user to download proprietary client server software systems and upgrades and newly introduced standards for low bit rate speech compression made available over a public or private network such as the Internet to ensure that the device may use the latest state-of-the-art audio compression software.
A preferred device also includes a software utility which enables the user to download proprietary client server software systems and upgrades and newly introduced standards which enable the device to receive highly compressed and/or streaming audio files containing voice content including, but not limited to application program interfaces (APIs) which enable the device to be used as a portable Internet Phone appliance to conduct a real-time, two-way, full-duplex voice conversation using a local connection to the Internet.
A preferred device also includes a software utility which extends the functionality of a Web program run from a Web browser and operates on data, such as audio data, as it flows in the user's PC, permitting the user to redirect audio files by the communication port directly to the device seated in a cradle and connected to the serial or parallel port. Alternatively, this could be achieved through OLE (Object Linking and Embedding) enabled Web software which, when activated by the user by pressing a designated key such as print, redirects audio files directly to a special "printer" driver dedicated to the device. The utility permits users who are browsing the Web on their computers to download audio files directly to their personal audio servers for later access, without having to transfer them from their hard disk.
A preferred device also includes a software utility which enables the user to select E-mail messages and request that the messages be converted from text to speech by an appropriate text-to-speech conversion application available to the network, and only subsequently digitized and transmitted as a digitized and compressed audio file.
The invention also relates to a method and software utility using DSVD (Digital Simultaneous Voice/Data) and/or the VoiceView protocols (Radish Communications Systems, Inc.) which would enable the user, once connected to a communication link, to transfer and receive audio files directly into a dictation and voice message recorder device simultaneously or, alternatively, with the user processing, and/or receiving and transmitting, other related or unrelated data to and from the network or, conversely, while the user is talking on the phone. The use of these voice/data protocols would permit the dictation and voice message recording/reviewing device user to request audio files in response to voice prompts spoken in digitized streaming or analog voice, to respond by spoken responses, keypad entries or DTMF tones and to transfer those files in high speed data mode during the same phone connection.
The invention also relates to a method and software utility which permit the scalability of digitized audio files in order to conform with network server requirements and/or user preferences. This would enable the server to demand or the user to request a lower compression rate or slower transmission speed in order to have higher fidelity for the audio file requested, and vice versa.
It is a feature of the present invention that a recording device may be left connected to a communication link and programmed to dial into and to connect to a local network access point at off-peak hours when telephone rates are lowest and when excess capacity on incoming lines is available. The recording device is programmed to search the network for audio files to which the user has a subscription, new audio files available from Web sites to which the user has programmed the device to look, and for audio mail sent to the user from selected E-mail addresses.
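The off-peak dial-in described above amounts to a time-window test before the device places its call. The following is a minimal sketch of such a test; the function name `is_off_peak` and the 23:00 to 06:00 window are assumptions made for illustration, since the description does not specify which hours count as off-peak.

```python
from datetime import time


def is_off_peak(now: time,
                start: time = time(23, 0),
                end: time = time(6, 0)) -> bool:
    """Return True when `now` falls inside the off-peak window.

    The default window (23:00 to 06:00) is a hypothetical choice; the
    window may wrap past midnight, which is handled by the second branch.
    """
    if start <= end:
        # Window lies within a single day, e.g. 01:00 to 05:00.
        return start <= now < end
    # Window wraps around midnight, e.g. 23:00 to 06:00.
    return now >= start or now < end
```

A scheduler built on this check would simply defer dialing the local access point until the function returns True.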
It is a feature of the present invention that an interface port, such as a standard RJ-11 telephone jack, is provided so that the recording device may be connected between a telephone set, computer, cellular phone or personal digital assistant and a communication link to enable the user to select and retrieve voice files while using any of the above devices. It is also a feature of the present invention that circuitry is provided for the digital conversion and compression of the analog voice signals recorded in the memory of a dictation and voice message recording/reviewing device to permit high density storage and high speed transmission of digitized voice. Similarly, circuitry is provided for the analog conversion and natural sounding playback of previously stored or received digitized voice.
It is also a feature of the present invention that there may be provided a public terminal, e.g. in a manner similar to an automated teller machine and located at places such as airports and tourist sites, where a user could connect his recording/reviewing device and select voice messages and audio-text to be retrieved and transmitted directly by the recording/reviewing device.
Brief Description of the Drawing
The foregoing, as well as the other objects, features and advantages of the present invention will be understood more completely from the following detailed description of a preferred embodiment, with reference being had to the accompanying drawing, in which:
Figure 1 is a schematic block diagram of a preferred personal audio message processor embodying the present invention; and
Figures 2-7 (Figure 2 comprises Figures 2a and 2b) are flow charts illustrating how certain processing is performed in the apparatus of Fig. 1.
Detailed Description
Figure 1 is a schematic block diagram of a presently preferred Personal Voice Server (PVS) system 10 embodying the present invention. PVS system 10 broadly comprises five main parts: a highly integrated DSP/RISC integrated chip 11 (DSP stands for Digital Signal Processor and RISC stands for Reduced Instruction Set Computer); a Telecom/Audio Codec 17; a memory such as SDRAM 12 and/or Flash Memory 13 coupled to the DSP chip; peripherals such as a microphone 26, a speaker 18, a touchscreen/display LCD 19, an infrared I/O 21 and a barcode reader 15; and operating system software, which manages the DSP to handle modem routines such as V.32bis, V.34 etc., voice recognition, echo cancellation and speech synthesis, and which also controls the system via the RISC part of chip 11. Although the embodying device 10 is referred to as a voice server, it should be clear that it is equally useful for other types of audio, including music.
The DSP chip is preferably a Philips Semiconductor PR31100 chip, which contains a MIPS R3000 RISC CPU core with 4 Kbytes of instruction cache and 1 Kbyte of data cache, plus various integrated functions for interfacing to numerous system components and external I/O modules. The chip also has a hardware multiply/accumulate unit to perform DSP functions, such as a software fax/modem which eliminates the need for an external modem chip set. However, the chip also has a UART (Universal Asynchronous Receive Transmit) interface 22 (shown separately), which permits the device to be connected to an external modem or other device (such as a modem-equipped Telephone Answering Device) through a conventional RS232 serial connector 23. The PR31100 also contains multiple DMA (direct memory access) channels and a high-performance, flexible Bus Interface Unit (BIU) for providing an efficient means for transferring data between external system memory, cache memory, the CPU core, and external I/O modules. The PR31100 also contains a System Interface Module (SIM), which provides integrated functions for interfacing to various external I/O modules, such as a liquid crystal display (LCD) 19, an infrared I/O module 21, and the Codec 17.
Codec 17 is preferably a Philips UCB1100 single chip integrated mixed signal audio and telecom codec, which handles most of the analog functions of the system, including the sound and telecommunications codec (analog/digital coding and decoding) functions and touchscreen analog-to-digital conversion, ISDN/high-speed serial, infrared, and wireless peripherals. The high-speed serial interface 14, although shown separately in Fig. 1, is actually part of the UCB1100. The chip has a single channel audio codec which is designed for direct connection of a microphone and speaker (i.e. components 16 and 28 are actually part of the UCB1100). The built-in telecommunications codec can be connected directly to a conventional RJ-11 jack 20 for connections to a telephone line. For a more complete understanding of the embodiment of Fig. 1, data sheets for the PR31100 and UCB1100 are attached and are incorporated in this description by reference.
The operating system software for the PR31100 is preferably Eden OS version 2.0, commercially available from the Eden Group Limited of Cheshire, England. This operating system is specifically designed to support the PR31100 (also known as DINO) and the UCB1100 (also known as BETTY). A data sheet for the Eden OS is attached, which describes the software support and the drivers provided by the operating system. This data sheet is incorporated in the present description by reference. Memory 12, 13 is used to store messages and to hold temporary data. The flash memory is configured according to the amount of permanent programs required, including operating system (O/S) and application software, and also to store some of the recorded messages. Typically, audio compression provided in the PR31100 will result in a data bandwidth of less than half a Kbyte per second (i.e. 1 Mbyte of memory will provide an hour of audio).
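The relationship between compressed data rate and storage can be checked with simple arithmetic. The sketch below is illustrative only: it assumes that "Kbyte" and "Mbyte" mean 1024 and 1024 × 1024 bytes, and the function names are hypothetical. It evaluates the half-Kbyte-per-second upper bound stated above and the 2.5 kbps compressed-audio rate given in the description of Figures 2a and 2b.

```python
# Assumption: binary units (1 Kbyte = 1024 bytes, 1 Mbyte = 1024 Kbytes).
KBYTE = 1024
MBYTE = 1024 * KBYTE


def megabytes_per_hour(bytes_per_second: float) -> float:
    """Storage, in Mbytes, consumed by one hour of audio at the given rate."""
    return bytes_per_second * 3600 / MBYTE


def minutes_per_megabyte(bytes_per_second: float) -> float:
    """Minutes of audio that fit in one Mbyte at the given rate."""
    return MBYTE / bytes_per_second / 60


# Upper bound from the text: half a Kbyte per second.
upper_bound = megabytes_per_hour(0.5 * KBYTE)

# 2.5 kbps (2500 bits per second) expressed in bytes per second.
low_rate = megabytes_per_hour(2500 / 8)

print(f"0.5 Kbyte/s: {upper_bound:.2f} Mbyte per hour")
print(f"2.5 kbps:    {low_rate:.2f} Mbyte per hour")
```

Under these assumptions the figure of about one hour per Mbyte corresponds to rates near 2.5 kbps, comfortably below the half-Kbyte-per-second bound, which itself would require somewhat under 2 Mbytes per hour.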
A microphone 26 and speaker 18 are selected based on quality and size. Flow diagrams are presented in Figs. 2 - 7 to describe the operation of retrieving messages over the Internet and transmitting them to and from the PVS as well as the various operational options for dialing, receiving data from a given server address in the Internet, storing, screening, retrieving, transmitting and playing messages to/from the PVS.
These operations include receiving compressed messages in digital form and audio signals in analog form bi-directionally from speaker/microphone and phone connection.
Figures 2a and 2b comprise a flow chart illustrating how the PVS connects to a location on the Internet by Transport Protocol and how the PVS gets all data relating to its Web/email site (e.g. HTML language displaying information) and receives/stores messages (audio, data, etc.) that were sent using either a proprietary or de facto standard (e.g. highly compressed audio at 2.5 kbps).
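Because the compressed audio rate is far below the modem's line rate, a message can be transferred much faster than it plays. The following is a rough sketch of that calculation. The 28,800 bps link rate is an assumption corresponding to a V.34 connection, protocol overhead is ignored, and the function names are hypothetical.

```python
def transfer_seconds(audio_seconds: float, audio_bps: float, link_bps: float) -> float:
    """Time to move `audio_seconds` of audio encoded at `audio_bps`
    over a link carrying `link_bps`, ignoring protocol overhead."""
    return audio_seconds * audio_bps / link_bps


def speedup(audio_bps: float, link_bps: float) -> float:
    """Ratio of playback time to transfer time."""
    return link_bps / audio_bps


AUDIO_BPS = 2500    # 2.5 kbps compressed audio, as in the text
LINK_BPS = 28800    # assumed V.34 line rate; real throughput will vary

one_hour = transfer_seconds(3600, AUDIO_BPS, LINK_BPS)
print(f"One hour of audio transfers in about {one_hour / 60:.1f} minutes")
print(f"Speed-up over real time: about {speedup(AUDIO_BPS, LINK_BPS):.1f}x")
```

With these assumed figures an hour of audio moves in a little over five minutes, which illustrates why the description favors transferring pre-recorded files over listening in real time on a toll call.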
The operation depicted in Figs. 2a and 2b is run concurrently by the real-time kernel of the DSP/RISC (discussed further below with reference to Fig. 3), which enables multiple tasks to be run and executed in parallel. Operation of the main task begins at block 200. Accessing a site and storing or receiving stored messages runs concurrently with other tasks. These tasks can be local tasks that operate the PVS, or other tasks such as the operation of the bar-code reader, voice synthesizer or voice recognition, or access to other Web sites by PPP at the same time.
At block 202, a test is performed to determine whether the desired operation is connection to a network access provider via an out-bound call (at block 210). If not, the modem, in response to a ring, answers the call, completes its handshake procedure, and begins receiving information (block 204). Data bits from the modem are received by DSP chip 11 at block 220. The DSP chip decodes the incoming data at block 230.
At block 240, a test is performed to determine whether the desired operation is to decode an HTML site. If not, control transfers to block 340. Otherwise operation continues at block 250, where the display of the site page begins. A test is performed at block 260 to determine whether the mode of operation is interactive or automatic. In the interactive mode, the user of the PVS has to browse and select the desired operation to be completed. In automatic mode, the keyword(s) to retrieve audio or other messages are searched for and activated automatically to get the compressed data. If the test at block 260 senses the interactive mode, control is transferred to block 110 in Fig. 2b. If not, automatic browsing is done starting at block 270 to search for a highlighted keyword symbol. At block 280, a test is performed to determine whether the keyword constitutes a request for a previously digitized message and, if so, the compressed data is received by the PVS using the FTP protocol at block 290. If the test at block 280 results in a "no", control transfers to block 310.
At block 310 a test is performed to determine whether no more messages exist, and, if so, control returns to block 100. Otherwise, a test occurs at block 320 to determine if the keyword constitutes a request for a place to store local messages at the web server. If so, this data, such as a compressed audio message, is transmitted from the PVS to the web site (block 330). If not, control returns to the start (block 100). The process is continued until there are no other stored messages for the PVS owner at this Web site.
At block 340, a test is performed to determine whether this site is utilizing the FTP protocol language. If so, a message is retrieved utilizing FTP (block 360), and it is stored at block 380 and control is transferred to block 120 in Fig. 2b. If it is determined at block 340 that FTP protocol is not being used, a test is performed at block 350 to determine whether or not a recognized access language is being received. If so, a message is retrieved at block 360 utilizing the recognized access language and is then stored at block 380. Control is then transferred to block 120 in Fig. 2b. If a recognized access language is not found at block 350, the user is notified at block 370 and control returns to block 100.
If it was determined at block 260 that the mode is interactive, control is transferred to block 110 in Fig. 2b. At block 112, the keywords in the web page are selected and, at block 114 HTML interpretation is activated to locate the messages in the pool. At block 116, messages are then sent and/or received and control is returned to block 100 in Fig. 2a.
Following block 380, where data was stored, preferably in compressed form, control is transferred to block 120 in Fig. 2b. Any data which is stored causes the creation of data in a flat database (block 120), which may be searched to locate the data at a later time. In case the message is an audio message, it is decompressed and played at the same time that it is transmitted by FTP protocol. The test at block 122 determines whether such action is necessary for the current message and, if so, decompression and the audio synthesizer are activated (block 124), the database is updated to reflect that the message is ready to be synthesized, and control is returned to block 100. If the message is not to be decompressed and played, block 122 transfers control to block 128, where a test is performed to determine if the message is to be sent to the web server and, if not, control is returned to block 100. If the message is to be sent to the web server, it is sent by FTP at block 130, and the user is notified upon completion of the transfer (block 132), after which control returns to block 100.
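The branching performed after the incoming data is decoded (blocks 240 through 370 of Fig. 2a) can be summarized as a small decision function. This is only a sketch of the control flow as described in the text, not the actual firmware; the names `Mode` and `next_block_after_decode` and the boolean parameters are hypothetical, and the return value is simply the number of the block to which control passes.

```python
from enum import Enum


class Mode(Enum):
    INTERACTIVE = "interactive"
    AUTOMATIC = "automatic"


def next_block_after_decode(is_html: bool,
                            mode: Mode,
                            uses_ftp: bool,
                            recognized_language: bool) -> int:
    """Return the flow-chart block that control passes to after block 230."""
    if is_html:
        # Block 240 answered yes: the page is displayed (block 250) and
        # block 260 tests the operating mode.
        if mode is Mode.INTERACTIVE:
            return 110   # user browses and selects (Fig. 2b)
        return 270       # automatic search for a highlighted keyword
    # Block 240 answered no: block 340 tests for FTP, then block 350
    # tests for any other recognized access language.
    if uses_ftp or recognized_language:
        return 360       # retrieve the message; it is then stored at block 380
    return 370           # notify the user; control then returns to block 100
```

For example, a non-HTML site that speaks neither FTP nor another recognized access language leads to block 370, where the user is notified.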
Figure 3 describes the overall operation of the Kernel of the Eden OS as run on the RISC core CPU of the DSP 11 for the present application. The kernel is multitasking, in that it can run multiple programs or tasks concurrently, with each one having its own priority and being capable of initiating other (child) tasks. After the Kernel initializes via blocks 400-420, operation starts in the idle mode at block 480, where the PVS waits for events to occur, and when one occurs it is handled at block 430. Every program interacts with the operating system this way, by having its tasks attended to at block 430. The events that arise are either synchronous or asynchronous. At block 440, if a synchronous event is detected, processing of the synchronous events is initiated via connector 5. Otherwise, a test is performed at block 450 to detect an asynchronous event, in which case processing of the asynchronous events is initiated via connector 6. In each case, after processing is initiated the operating system returns to the idle mode to process other events. Another special event is error handling at block 460. In the event that an asynchronous event is not detected at block 450, a test is performed at block 460 to detect a failure event and, if there is none, the program returns to the idle mode. In the event of a hardware failure, a communications failure or a software failure, an error event is detected at block 460 and a run time handler is issued (block 470) to handle the event. Control then returns to the idle mode. The synchronous and asynchronous events identified in Fig. 3 are only exemplary and it is contemplated that there may be others of each type.
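The event handling described for Fig. 3 follows a familiar idle-and-dispatch pattern. The sketch below models only the order of the tests at blocks 440, 450 and 460; it is not Eden OS code, and the names `EventKind` and `run_kernel` are hypothetical. Events are supplied as a list and the function returns which handler each one would reach.

```python
from collections import deque
from enum import Enum, auto


class EventKind(Enum):
    SYNCHRONOUS = auto()
    ASYNCHRONOUS = auto()
    FAILURE = auto()


def run_kernel(events):
    """Drain the pending events and record the handler chosen for each."""
    queue = deque(events)
    log = []
    while queue:                      # block 480: idle until an event is pending
        kind = queue.popleft()        # block 430: take the next event
        if kind is EventKind.SYNCHRONOUS:       # block 440
            log.append("sync")                  # processing via connector 5
        elif kind is EventKind.ASYNCHRONOUS:    # block 450
            log.append("async")                 # processing via connector 6
        elif kind is EventKind.FAILURE:         # block 460
            log.append("run-time handler")      # block 470
        # Anything else falls through and the loop returns to idle.
    return log
```

A real multitasking kernel would also schedule tasks by priority; that aspect is deliberately omitted here to keep the dispatch order visible.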
Figure 4 is a block diagram illustrating the routine performed by the controller of DSP/RISC Chip 11 when an analog audio message is to be recorded. At block 710, a test is performed to determine whether the incoming messages are from the built-in microphone. If not, control is transferred to the routine of Fig. 5. If so, the audio message is digitized and compressed (block 720) and placed in the working pool of data (block 730). At block 740, a test is performed to determine whether memory was filled before an entire message was stored. If not, the routine is terminated, and control returns to the idle mode. If so, recording is disabled (block 750), and the operator is notified, as by a warning light, that the memory is full (block 760). Control reverts to the idle mode.
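The recording routine of Fig. 4 can be sketched as a loop that compresses incoming audio, appends it to the working pool and stops when memory runs out. Everything named below is hypothetical: `toy_compress` is a stand-in that merely drops every other byte and is not a real speech codec, and `capacity` is an assumed byte limit for the pool.

```python
def toy_compress(frame: bytes) -> bytes:
    """Placeholder for the block 720 compression step (NOT a real codec)."""
    return frame[::2]


def record_from_microphone(frames, pool, capacity, compress=toy_compress):
    """Store compressed frames in `pool` until input ends or memory fills.

    Returns True if the entire message was stored, or False if recording
    had to be disabled because the pool reached `capacity` bytes.
    """
    used = sum(len(item) for item in pool)
    for frame in frames:
        compressed = compress(frame)            # block 720: digitize/compress
        if used + len(compressed) > capacity:   # block 740: memory filled?
            return False                        # blocks 750-760: stop, warn user
        pool.append(compressed)                 # block 730: add to working pool
        used += len(compressed)
    return True
```

A caller that receives False would light the memory-full indicator before returning to the idle mode.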
Figure 5 is a block diagram illustrating the routine performed to record analog audio from the telephone line. At block 800, a test is performed to determine whether an audio message being received is from the communications link (telephone line). If not, control is transferred to the routine of Fig. 6. If so, the message is passed through the Telecom/Audio Codec 17 as audio (block 810), and a test is performed at block 820 to determine whether compression is to be performed by the DSP/RISC Chip. If so, the message is stored in local memory (block 830), recording is stopped, and control is returned to the idle mode. If compression is not to be performed by the DSP/RISC Chip, the message is sent to the Telecom/Audio Codec, which compresses it by a standard (ADPCM) algorithm (block 840). The message is then sent back to the DSP/RISC 11 through its UART (block 850), and the DSP/RISC chip control then causes the message to be stored in flash memory 13 (block 860). Control is then returned to the idle mode.
Figure 6 is a block diagram of the routine performed by the Audio/Telecom Codec controller to play stored audio through the built-in speaker. At block 900, the operator selects a message from the pool of messages stored in the device. At block 910, a test is performed to determine whether the stored message to be read was originally compressed by the audio/telecom codec. If not, control is transferred to block 920. If so, the message is read and decompressed using the audio/telecom codec (block 930), and the decompressed message is applied to the digital-to-analog converter (DAC) in the audio/telecom codec (block 940). The message is then played via the built-in speaker 18 through the D/A converter and amplifier 28 (block 950), and control is returned to the idle mode.
If the stored message was not originally compressed by the audio/telecom codec, a test is performed at block 920 to determine whether the stored message was originally compressed by the audio/telecom codec. If not, the user is notified (block 960), and control is returned to the idle mode. If so, the message is read by the controller (block 970), and it is then sent to the modem to be decompressed and then returned from the modem to memory 13 through the UART port of the audio/telecom codec 17 (block 980). Control is transferred to block 940, and playback is handled in the same manner as a message originally compressed by the audio/telecom codec.

Figure 7 is a schematic illustration of how the PVS, connected to its cradle, may be connected to a PC (whether multimedia or not) or to a specially configured TAD with a built-in modem in order to permit a PC or TAD user (A) to send or receive a voice file from or to the PVS through a modem other than the telecom/audio codec of the PVS. This would permit a PC user to send or attach a voice file resident in the PVS over the PC's modem and would likewise permit the PC user to download a voice file received over the PC's modem directly to the PVS. The same configuration would permit a non-multimedia PC user (B) to play audio files by using the PVS's multimedia capabilities to play audio files received over the non-multimedia PC's modem. This configuration would likewise permit the PC user (C) to record audio through the PVS's built-in microphone and transmit it through the PC's modem as files or streaming audio. Such a configuration would also permit the user of a PC (D) to redirect audio files directly to the PVS while using a standard Web browser program. Finally, a similar configuration with a modem-configured TAD would permit the TAD user to download audio messages to and from the TAD to the PVS. Bi-directional communication from the PC to the PVS is handled by a communication cable (e.g. 9 pin connector) at the PC and the serial RS232 port on the PVS and controlled by the asynchronous event software controlling input/output from the UART communication interface.
The software at the PC handles the driver for sending/receiving data to/from the PC to the PVS. For sending data, this would be similar to a PC sending data to a fax or printer, and for receiving data, this is similar to a PC receiving data from a scanner. This driver sets all required parameters for the PVS such as type of operation, length and wait for acknowledgment and "End of Transmission". The PC also handles the software to use the PVS as an attachment (peripheral) for receiving multimedia audio messages so that the speaker on the PVS will operate. The PC also handles the software to manage the microphone input of the PVS, and software to integrate with a standard Web browser (e.g. Netscape Navigator) to be fully integrated with the software and invoke commands to the PVS accordingly.
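The driver parameters mentioned above (type of operation, length, and "End of Transmission") suggest a simple framed serial transfer. The sketch below is only one possible layout and is not taken from the disclosure: the one-byte operation code, the two-byte big-endian length, and the use of the ASCII EOT character (0x04) as a trailer are all assumptions, as are the names `build_frame`, `parse_frame` and `OP_SEND_VOICE`.

```python
import struct

EOT = 0x04             # ASCII End of Transmission, assumed as the frame trailer
OP_SEND_VOICE = 0x01   # hypothetical operation code


def build_frame(operation, payload):
    """Prefix the payload with operation type and length, then append EOT."""
    header = struct.pack(">BH", operation, len(payload))
    return header + payload + bytes([EOT])


def parse_frame(frame):
    """Recover (operation, payload), rejecting frames whose length or trailer is wrong."""
    operation, length = struct.unpack(">BH", frame[:3])
    if len(frame) != 3 + length + 1 or frame[-1] != EOT:
        raise ValueError("malformed frame")
    return operation, frame[3:3 + length]
```

In such a scheme the receiver would answer each valid frame with an acknowledgment before the sender transmits the next one; that handshake is omitted here.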
The software in the PVS is part of the multitasking operating functions to handle Remote activation of Procedural Calls (RPC) controlled under the asynchronous events software of the PVS.
Although preferred embodiments of the invention have been disclosed for illustrative purposes, those skilled in the art will appreciate that many additions, modifications and substitutions are possible without departing from the scope and spirit of the invention as defined in the accompanying claims.
Philips Semiconductors Preliminary specification
Highly integrated embedded processor MIPS PR31100
Version 1.2
GENERAL DESCRIPTION
PR31100 Processor is a single-chip, low-cost, integrated embedded processor consisting of MIPS R3000 core and system support logic to interface with various types of devices.
PR31100 consists of a MIPS R3000 RISC CPU with 4 KBytes of instruction cache memory and 1 KByte of data cache memory, plus integrated functions for interfacing to numerous system components and external I/O modules. The R3000 RISC CPU is also augmented with a multiply/accumulate module to allow integrated DSP functions, such as a software modem for high-performance standard data and fax protocols. PR31100 also contains multiple DMA channels and a high-performance and flexible Bus Interface Unit (BIU) for providing an efficient means for transferring data between external system memory, cache memory, the CPU core, and external I/O modules. The types of external memory devices supported include dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), static random access memory (SRAM), Flash memory, read-only memory (ROM), and expansion cards (PCMCIA and/or MagicCard). PR31100 also contains a System Interface Module (SIM) containing integrated functions for interfacing to numerous external I/O modules such as liquid crystal displays (LCDs), the UCB1100 (which handles most of the analog functions of the system, including sound and telecom codecs and touchscreen ADC), ISDN/high-speed serial, infrared, wireless peripherals, Magicbus, etc. Lastly, PR31100 contains support for implementation of power management, whereby various PR31100 internal modules and external subsystems can be individually (under software control) powered up and down.
FEATURES
• 32-bit R3000 RISC static CMOS CPU
• 4 KByte instruction cache
• 1 KByte data cache
• Multiply/accumulator
• On-chip peripherals with individual power-down
- Multi-channel DMA controller
- Bus interface unit
- Memory controller for ROM, Flash, RAM, DRAM, SDRAM, SRAM, and PCMCIA and/or MagicCard
- Power management module
- Video module
- Real-time clock 32.768 kHz reference
- High-speed serial interface
- Infrared module
- Dual-UART
- SPI bus
• 3.3V supply voltage
• 208-pin LQFP (Low profile quad flat pack)
• 40MHz operation frequency
Figure 1 shows an External Block Diagram of PR31100.
ORDERING INFORMATION
Figure imgf000028_0001
1996 Aug 07
Figure imgf000029_0001
Figure 1. PR31100 Block Diagram
OVERVIEW
Each of the on-chip peripherals consists of:
BIU Module
• System memory and PR31100 Bus Interface Unit (BIU)
- supports up to 2 banks of physical memory
- supports self-refreshing DRAM and SDRAM
- programmable parameters for each bank of DRAM or SDRAM (row/column address configuration, refresh, burst modes, etc.)
• programmable chip select memory access
- 4 programmable (size, wait states, burst mode control) memory device and general purpose chip selects
  available for system ROM, SRAM, Flash
  available for external port expansion registers
- 4 programmable (wait states, burst mode control) MagicCard or general purpose chip selects
  available for (future) MagicCard expansion memory
  PR31100 provides the chip select and card detect signals
  supports card insertion/removal timeouts
  MagicCard requires minimal number of unique control status signals per port
• supports up to 2 identical full PCMCIA ports
- PR31100 and UCB1100 provide the control signals and accept the status signals which conform to the PCMCIA version 2.01 standard
- appropriate connector keying and level-shifting buffers required for 3.3V versus 5V PCMCIA interface implementations
SIU Module
• multi-channel 32-bit DMA controller and System Interface Unit (SIU)
• independent DMA channels for video, Magicbus, SIB to/from UCB1100 audio/telecom codecs, high-speed serial port, IR UART, and general purpose UART
• address decoding for submodules within System Interface Module (SIM)
CPU Module
• R3000 RISC central processing unit core
- full 32-bit operation (registers, instructions, addresses)
- 32 general purpose 32-bit registers; 32-bit program counter
- MIPS RISC Instruction Set Architecture (ISA) supported
• on-chip cache
- 4 KByte direct-mapped instruction cache (I-cache)
  physical address tag and valid bit per cache line
  programmable burst size
  instruction streaming mode supported
- 1 KByte data cache (D-cache)
  physical address tag and valid bit per cache line
  programmable burst size
  write-through
- cache address snoop mode supported for DMA
- 4-level deep write buffer
• programmable memory protection
- separate read and write protection control for kernel and user space
- 8 total protectable regions available, each individually programmable, using breakpoint address, mask, control, and status registers
- causes address exception on illegal reads or writes
• high-speed multiplier/accumulator
- on-chip hardware multiplier
- supports 16x16 or 32x32 multiplier operations, with 64-bit accumulator
- existing multiply instructions are enhanced and new multiply and add instructions are added to R3000 instruction set to improve the performance of DSP applications
• CPU interface
- handles data bus, address bus, and control interface between CPU core and rest of PR31100 logic
Clock Module
• PR31100 supports system-wide single crystal configuration, besides the 32 KHz RTC XTAL (reduces cost, power, and board space)
• common crystal rate divided to generate clock for CPU, video, sound, telecom, UARTs, etc.
• external system crystal rate is vendor-dependent
• independent enabling or disabling of individual clocks under software control, for power management
CHI Module
• high-speed serial Concentration Highway Interface (CHI) contains logic for interfacing to external full-duplex serial time-division-multiplexed (TDM) communication peripherals
• supports ISDN line interface chips and other PCM TDM serial devices
• CHI interface is programmable (number of channels, frame rate, bit rate, etc.) to provide support for a variety of formats
• supports data rates up to 4.096 Mbps
• independent DMA support for CHI receive and transmit
Interrupt Module
• contains logic for individually enabling, reading, and clearing all PR31100 interrupt sources
• interrupts generated from internal PR31100 modules or from edge transitions on external signal pins
IO Module
• contains support for reading and writing the 7 bi-directional general purpose IO pins and the 32 bi-directional multi-function IO pins
• each IO port can generate a separate positive and negative edge interrupt
• independently configurable IO ports allow PR31100 to support a flexible and wide range of system applications and configurations
Figure imgf000031_0001
IR Module
• IR consumer mode
- allows control of consumer electronic devices such as stereos, TVs, VCRs, etc.
- programmable pulse parameters
- external analog LED circuitry
• IRDA communication mode
- allows communication with other IRDA devices such as FAX machines, copiers, printers, etc.
- supported by UART module within PR31100
- external analog receiver preamp and LED circuitry
- data rate = up to 115 Kbps at 1 meter
• IR FSK communication mode
- supported by UART module within PR31100
- external analog IR chip(s) perform frequency modulation to generate the desired IR communication mode protocol
- data rate = up to 36000 bps at 3 meters
• carrier detect state machine
- periodically enables IR receiver to check if a valid carrier is present
Magicbus Module
• synchronous, serial 2-wire (clock and data), half-duplex communications protocol
• supports low-cost, low-power peripherals
• supports maximum data rate of 14.75 Mbps
• DMA support for Magicbus receive and transmit
Power Module
• power-down modes for individual internal peripheral modules
• serial (SPI port) power supply control interface supported
• power management state machine has 4 states: RUNNING, DOZING, SLEEP, and COMA
Serial Interconnect Bus (SIB) Module
• PR31100 contains holding and shift registers to support the serial interface to the UCB1100 and/or other optional codec devices
• interface compatible with slave mode 3 of Crystal CS4216 codec
• synchronous, frame-based protocol
• PR31100 always master source of clock and frame frequency and phase; programmable clock frequency
• each SIB frame consists of 128 clock cycles, further divided into 2 subframes or words of 64 bits each (supports up to 2 devices simultaneously)
• independent DMA support for audio receive and transmit, telecom receive and transmit
• supports 8-bit or 16-bit mono telecom formats
• supports 8-bit or 16-bit mono or stereo audio formats
• independently programmable audio and telecom sample rates
• CPU read/write registers for subframe control and status
System Peripheral Interface (SPI) Module
• provides interface to SPI peripherals and devices
• full-duplex, synchronous serial data transfers (data in, data out, and clock signals)
• PR31100 supplies dedicated chip select and interrupt for an SPI interface serial power supply
• 8-bit or 16-bit data word lengths for the SPI interface
• programmable SPI baud rate
Timer Module
• Real Time Clock (RTC) and Timer
• 40-bit counter (30.517 μsec granularity), maximum uninterrupted time = 388.36 days
• 40-bit alarm register (30.517 μsec granularity)
• 16-bit periodic timer (0.868 μsec granularity), maximum timeout = 56.8 msec
• interrupts on alarm, timer, and prior to RTC roll-over
UART Module
• 2 independent full-duplex UARTs
• programmable baud rate generator
• UART-A port used for serial control interface to external IR module
• UART-B port used for general purpose serial control interface
• UART-A and UART-B DMA support for receive and transmit
Video Module
• bit-mapped graphics
• supports monochrome, grey scale, or color modes
• time-based dithering algorithm for grey scale and color modes
• supports multiple screen sizes
• supports split and non-split displays
• variable size and relocatable video buffer
• DMA support for fetching image data from video buffer
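The Timer Module figures quoted in the overview can be cross-checked with a short calculation. This is a worked sketch, assuming the RTC counts ticks of a 32.768 kHz crystal (an assumption that matches the quoted 30.517 μsec granularity); the variable names are illustrative only.

```python
RTC_HZ = 32_768                                # assumed RTC crystal frequency
tick_us = 1e6 / RTC_HZ                         # period of one RTC tick in microseconds
rollover_days = (2 ** 40) / RTC_HZ / 86_400    # span of the 40-bit counter in days

periodic_tick_us = 0.868                       # granularity quoted for the 16-bit timer
max_timeout_ms = (2 ** 16) * periodic_tick_us / 1000   # longest periodic timeout

print(f"RTC tick: {tick_us:.3f} us")
print(f"40-bit rollover: {rollover_days:.2f} days")
print(f"16-bit timer span: {max_timeout_ms:.1f} ms")
```

The 40-bit counter therefore rolls over after roughly 388.36 days, in agreement with the quoted maximum uninterrupted time, and the 16-bit periodic timer spans a little under 57 ms, close to the quoted maximum timeout.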
Figure 2 shows a typical system block diagram consisting of PR31100 and UCB1100 for a total system solution.
Figure imgf000032_0001
Figure 2. System Block Diagram
Philips Semiconductors Preliminary specification
Advanced modem/audio analog front-end UCB1100
Version 1.2
GENERAL DESCRIPTION
The UCB1100 is a single chip, integrated mixed signal audio and telecom codec. The single channel audio codec is designed for direct connection of a microphone and speaker. The built-in telecom codec can directly be connected to a DAA and supports high speed modem protocols. The incorporated 10 bit analogue to digital converter and the touch screen interface provide complete control and readout of a connected 4 wire resistive touch screen. The 10 additional general purpose I/O pins provide programmable inputs and/or outputs to the system.
The UCB1100 has a serial interface bus (SIB) intended to communicate to the system controller. Both the codec input and output data and the control register data are multiplexed on this SIB interface.
APPLICATIONS
• Personal Intelligent Communicators (PIC)/Personal Digital Assistants (PDA)
• Screen phones
• Smart Phone and smart Fax
• Intelligent Communicators
KEY FEATURES
• 48-pin LQFP (SOT313-2) small body SMD package and low external component count result in minimal PCB space requirement.
• A 12-bit sigma delta audio codec with programmable sample rate, input and output voltage levels, capable of connecting directly to speaker and microphone, including digitally controlled mute, loopback and clip detection functions.
• A 14-bit sigma delta telecom codec with programmable sample rate, including digitally controlled input voltage level, mute, loopback and clip detection functions. The telecom codec is intended for direct connection to a DAA (digital access arrangement) and includes a built-in sidetone suppression circuit.
• A complete 4 wire resistive touch screen interface circuit supporting position, pressure and plate resistance measurements.
• A 10-bit successive approximation ADC with internal track and hold circuit and analogue multiplier for touch screen readout and monitoring of four external high voltage (7.5V) analogue voltages.
• A high speed, 4 wire serial interface data bus (SIB) for communication to system controller.
• A 3.3V supply voltage and built in power saving modes make the UCB1100 optimal for portable and battery powered applications.
TABLE OF CONTENTS
GENERAL DESCRIPTION
APPLICATIONS
KEY FEATURES
TABLE OF CONTENTS
1.0 FUNCTIONAL BLOCK DIAGRAM
2.0 ORDERING INFORMATION
3.0 ABSOLUTE MAXIMUM RATINGS
4.0 DC ELECTRICAL CHARACTERISTICS
5.0 PINOUT
5.1 PINLIST
6.0 FUNCTIONAL DESCRIPTION
6.1 AUDIO CODEC
6.1.1 AUDIO INPUT SPECIFICATIONS
6.1.2 AUDIO OUTPUT SPECIFICATIONS
6.2 TELECOM CODEC
6.2.1 TELECOM INPUT SPECIFICATIONS
6.2.2 TELECOM OUTPUT SPECIFICATIONS
6.3 TOUCH SCREEN MEASUREMENT MODES
6.3.1 POSITION MEASUREMENT
6.3.2 PRESSURE MEASUREMENT
6.3.3 PLATE RESISTANCE MEASUREMENT
6.4 TOUCH SCREEN INTERFACE
6.4.1 TOUCH SCREEN SPECIFICATIONS
6.5 10 BIT ADC
6.5.1 SPECIFICATION OVERVIEW
6.6 ON CHIP REFERENCE CIRCUIT
6.6.1 SPECIFICATION OVERVIEW
6.7 SERIAL INTERFACE BUS
6.7.1 SIB DATA FORMAT
6.7.2 CODEC DATA TRANSFER
6.7.3 CONTROL REGISTER DATA TRANSFER
6.7.4 AC ELECTRICAL CHARACTERISTICS
6.8 GENERAL PURPOSE I/Os
6.9 INTERRUPT GENERATION
6.10 RESET CIRCUITRY
7.0 MISCELLANEOUS
7.1 POWER ROUTING STRATEGY
8.0 CONTROL REGISTER OVERVIEW
9.0 PACKAGE OUTLINES
9.1 PACKAGE OUTLINE LQFP48
10.0 DEFINITIONS
1996 Apr 09
1.0 FUNCTIONAL BLOCK DIAGRAM
Figure imgf000034_0002
Figure imgf000034_0001
SH00126
Figure 1. Block Diagram of the UCB1100
Eden OS V.2.0 Overview
EROS (Eden Real-time Operating System) is a full-featured operating system designed from scratch to be:
Compact: the operating system consumes resources in the form of ROM and RAM in the product. These resources add to the BOM costs of the product, and any space occupied by the OS must be justified. EROS is designed to be small. Modularity also supports the compactness of the operating system: where individual products do not need a feature, it can be omitted or replaced by some subset, leaving more room for the visible components that add features and thus perceived value.
Open: an open OS will be more likely to attract 3rd party developers looking to design software products for sale, so allowing more value in the form of available features to be added to products based on the OS. EROS has a published API and a PC-based SDK which supports the development of applications in a readily available development platform.
Modular: Each component individually and in many cases sub-components may be omitted or replaced without difficulty where their functionality is not needed or has to be changed for particular products.
Portable: 99% written in ANSI C, porting to a new processor and/or tailoring to a specific product design is sufficiently simple and predictable for this to be completely acceptable within a product development lifecycle. EROS offers the same application interface on each platform, allowing applications to run on any EROS platform. EROS application development is carried out on a PC SDK incorporating a subset of the target OS. In the medium term, Eden will adopt the GNU toolset for the development of EROS itself and support this toolset for all targets.
The overall structure of EROS is shown in the enclosed slide.
© 1995/6, Eden Group Ltd, England. All Rights Reserved.
Figure imgf000036_0001
Figure imgf000037_0001
The components of EROS are:
Advanced Real-time Kernel (ARK): This is the core of EROS; based on the ITRON 3 specification and extended, this supports pre-emptive, prioritised multitasking, message queues, semaphores, rendezvous ports, event flags and interrupt handling.
Virtual Memory Management (VMM): Depending on the level of support available within the chosen platform, this offers protection against faulty applications, mapping of virtual memory onto real memory and supplies the dynamic memory handling (malloc() and free()).
Eden's Visual Environment (EVE): This offers an object oriented means of building up a GUI. EVE implements a core set of simple objects which do not impose a 'look and feel' on the OEM and application provider. EVE also supports a limited number of compound objects (as the name implies, constructed by joining simple objects together). Application writers can easily generate their own compound objects to implement the GUI they design.
Advanced Database Access Module (ADAM): This is a traditional database implementation, offering a record structure, insert, delete, search, data integrity checks and record locking. It differs from other database implementations by being designed to operate in an embedded environment.
Clipboard Application Interface (CAIN): The EROS clipboard supports copy, cut and paste and drag and drop. It does this by allowing applications to set-up self-describing data items which can then be passed between applications which have no knowledge of each other.
Generic Object Data System (GODS): EROS' file system is built as a number of layers, allowing multiple filing systems to be supported (typically a DOS-compatible filing system on PC-cards and a Flash-oriented one for built-in non-volatile storage) without the applications being aware of such details.
PC card services: EROS supports SRAM, Flash and ATA drives as storage and data exchange devices. The PC card services offer a key set of facilities allowing support for specific card types to be developed as necessary.
Device Handling: One of the features of embedded systems is that they often have non-standard devices and PC-cards supply loadable devices which may not be known at the time the system is first built. EROS' Device Manager supports the dynamic addition of device drivers and allows handler tasks to establish a connection to whichever is the most appropriate driver.
TCP/IP: EROS supports TCP/IP, SLIP and PPP. A number of higher-level protocols are supported as standard within the OS including UDP, FTP, SMTP, POP3, and HTTP. Other protocols are supported on a specific product or implementation basis.
Other features supported by EROS include:
Linking and Loading: Embedded systems are typically provided as a single ROM containing the operating system and all the applications. The addition of new applications and the correction of those supplied in ROM is difficult. Flash memory is used, but the mechanisms for upgrade and addition are usually clumsy. EROS makes use of a Dynamic Linker Loader (ELF) to overcome much of this difficulty. EROS itself and built-in applications are installed in ROM but their external linkage symbols are loaded into RAM during start-up. Patches can be installed so that later in the start-up sequence some of these symbols are changed to point to new code, thus avoiding the obsolete areas of code in ROM. Similarly, applications which are loaded dynamically are linked to this symbol table and so use the correct built-in and patched code.
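The patching scheme described above can be pictured as a symbol table held in RAM whose entries are repointed before applications are linked. The following is a minimal sketch under that reading, not EROS code: the function names and the two example symbols are hypothetical, and Python callables stand in for ROM and patch code addresses.

```python
def rom_beep():
    return "beep v1 (ROM)"


def rom_version():
    return "1.0"


def patched_beep():
    return "beep v2 (patch)"


def build_symbol_table():
    """Start-up step: copy the ROM's external linkage symbols into RAM."""
    return {"beep": rom_beep, "version": rom_version}


def apply_patches(symbols, patches):
    """Later start-up step: repoint selected symbols at replacement code."""
    for name, replacement in patches.items():
        if name not in symbols:
            raise KeyError(f"cannot patch unknown symbol {name!r}")
        symbols[name] = replacement


def load_application(symbols, imports):
    """Link a dynamically loaded application against the current table."""
    return {name: symbols[name] for name in imports}
```

Because the application resolves its imports through the table after patching, it picks up the replacement for `beep` while still using the unpatched ROM code for `version`.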
Localisation: The OS structure supports OEMs and application developers in providing a framework within which applications can be constructed which are easily ported from language to language and from country to country, with little or ideally no change to software.
Power Management: Embedded applications are often battery powered and hence power use is critical. While the degree of support offered by particular processors and products will vary, EROS supports an API which allows applications to be constructed in a power-sensitive manner and supports the specific attributes of particular platforms in an appropriate manner.
Application Interface: Any application program interacts with EROS through the Application Program Interface (API). At the programming level these appear as function calls. These functions are primarily in the form of 'helpers' which execute as part of the application task and exchange information with one or more EROS tasks before returning to the application code. Responses and other input from EROS are provided by messages sent to the task's input queue or, for so-called 'blocking' calls, by the helper function using a 'rendezvous' for the exchange. Application tasks are usually structured as a single message handling loop which takes messages from a message queue.
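The task structure described above, a single loop that takes messages from a queue and dispatches them, can be sketched as follows. This is an illustration only and does not use the EROS API: the `(kind, payload)` message shape, the `"quit"` message and the handler table are assumptions, and the standard-library `queue.Queue` stands in for the task's input queue.

```python
import queue


def application_task(inbox, handlers):
    """Single message-handling loop: take a message, dispatch it, repeat."""
    handled = []
    while True:
        kind, payload = inbox.get()          # blocks until a message arrives
        if kind == "quit":                   # hypothetical shutdown message
            break
        handler = handlers.get(kind)
        if handler is not None:              # unknown message kinds are ignored
            handled.append(handler(payload))
    return handled
```

A task built this way does all of its work in response to messages, which is what allows the operating system to deliver responses and other input simply by posting to the task's queue.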
Development Tools: EROS includes a set of tools to enable applications to be developed for EROS platforms. Such applications will usually be platform (processor) and product independent, subject to appropriate devices being available to handle the interfaces. The toolset comprises:
• a sub-set of EROS which executes in DOS on a PC and provides an environment in which most applications can be developed and tested. This requires that the developer uses the Borland 4.5 development system.
• cross-compilers, linker and host-target debuggers are specific to the target platform; Eden will recommend these on a platform specific basis but in the medium term will primarily suggest and support the GNU tools.
• a terminal/target monitor program which allows internal details of EROS to be examined
• font and icon editors
• full linking instructions are provided to allow OEMs to build ROM images which include EROS and built-in applications
• full construction details are supplied to allow a patch file to be created
• full instructions are supplied to allow loadable programs to be produced
• EROS for the target platform is supplied in the form of shared libraries making up the 'helper' functions, object code for the EROS tasks and an initial startup sequence to be modified by or on behalf of the OEM.
Figure imgf000040_0001
Target hardware and product-specific issues: A very large proportion of EROS is hardware and product independent, requiring simply re-compilation to run on a new platform. Thus the amount of effort required to tailor EROS to a specific processor and product configuration is relatively small.
The areas usually requiring rework on a per-platform (i.e. per-processor) basis are:
• basic serial port driving and monitor production.
• kernel mapping at the lowest level
• core start-up sequence
• memory mapping to use the target architecture
The primary areas where such work is usually necessary on a per-product basis are:
• keyboard, screen and digitiser handling: typically each product uses different hardware in these areas. EROS offers a simple interface to program to, and Eden will do this work if required.
• memory configuration and start-up: EROS supplies a skeleton start-up sequence (above) for each target platform; extending this is a product-specific task.
• Non-standard devices: EROS has a device handling architecture which supports the addition of new device handlers.
• PC-card interfacing: Eden generally has to rework the lower levels of PCMCIA card handling to use the particular controller selected.
• The development version of EROS on the PC requires changes to match the screen size of the target product, to support GUI development.

Claims

Claims:_ 1. A portable apparatus for communication of audio signals in analog and digital form and for storage of the same, comprising: digital storage means; a communication connection to a communication channel; a telecommunications interface having a communications input and output coupled to said communication connection and a digital input and output; an analog-to-digital converter having an output coupled to said storage means; and a controller coupled to said storage means and said telecommunications interface digital input and output and comprising: means for detecting whether a signal on said communication connection is an analog or digital audio signal; routing means controlled by said means for detecting and coupled to said telecommunications interface, said storage means and said analog-to-digital converter, upon said detecting means detecting a digital signal said routing means causing the digital output of said telecommunications interface to be coupled to said storage means, upon said detecting means detecting an analog signal said routing means causing said telecommunications interface to bypass the signal on said connection and coupling the
2. same to said analog-to-digital converter for subsequent storage in said storage means .
3. The apparatus of Claim 1 further comprising the coupling to said storage means being effected through a device which compresses the signal prior to storage.
4. The apparatus of Claim 1, said controller further comprising: means for assembling digital messages stored in said storage means into a packetized data stream containing data and control bits; and means for coupling said packetized data stream to the digital input of said telecommunications interface for transmission over said communication channel.
5. An apparatus as in claim 3 , wherein said controller causes said telecommunications interface to transmit said packetized data stream at a rate that is substantially higher than the transmission rate of digitized voice.
6. The apparatus of claim 1 further comprising a connection to a digital communications channel and an interface therebetween and said controller.
7. The apparatus of claim 1 wherein said digital communication channel and the corresponding interface are designed to handle infrared communications.
8. The apparatus of claim 1 further comprising a bar code reader coupled to said controller.
9. The apparatus of claim 1 further comprising an LCD touchscreen coupled to said controller.
10. An apparatus for communication of audio signals in analog and digital form and for storage of the same, comprising: digital storage means ; a connection to a communication channel; a telecommunications interface having an analog input and output coupled to said connection and a digital input and output ; and a controller coupled to said storage means and said telecommunications interface and comprising; means for assembling digital messages stored in said storage means into a packetized data stream containing data and control bits; and means for coupling said packetized data stream to the digital input of said telecommunications interface for transmission over said communication channel.
11. An apparatus as in claim 9 wherein said controller causes said telecommunications interface to transmit said packetized data stream at a rate that is substantially higher than the transmission rate of digitized voice.
12. An apparatus as in claim 9 wherein said controller includes a module for detecting receipt on the communication channel of a message in HTML language and permitting two-way communication in said language.
13. An apparatus as in claim 9 wherein said controller includes a module for detecting receipt on the communication channel of a message in FTP language and permitting two-way communication in said language.
14. An apparatus as in claim 9 wherein said controller further comprises a speech synthesizer responsive to receipt of text information over said communication channel to produce an audible message simulating said text information being spoken by a human voice.
15. An apparatus as in claim 9 wherein said controller further comprises a database management module for receiving information about stored data and permitting selective retrieval of said information.
16. A method for communication of audio signals in analog and digital form over a communication channel and for storage of the same, comprising the steps of: detecting whether a signal on said channel is an analog or digital audio signal; upon detecting a digital signal on said channel, storing in a digital storage means the output of a telecommunications interface of the type having an input coupled to said channel and a digital output; upon detecting an analog signal on said channel, converting the same from analog to digital form and storing the converted signal in a digital storage means.
17. The method of claim 15 wherein prior to either of said storing steps said signal is compressed.
18. The method of claim 15 performed with a telecommunications interface of the type having an analog input and output coupled to said channel and a digital input and output and further comprising the steps of: assembling digital messages stored in said storage means into a packetized data stream containing data and control bits; and coupling said packetized data stream to the digital input of said modem for transmission over said communication channel at a rate that is substantially higher than the transmission rate of digitized voice.
19. A method for communication of audio signals in analog and digital form over a communication channel and for storage of the same, said method being performed with a telecommunications interface of the type having an analog input and output coupled to said channel and a digital input and output and comprising the steps of: assembling digital messages stored in a storage means into a packetized data stream containing data and control bits; and coupling said packetized data stream to the digital input of said modem for transmission over said communication channel at a rate that is substantially higher than the transmission rate of digitized voice.
20. A portable device which permits the user to record, edit, play and review voice messages and other audio material which may be received from, and subsequently transmitted to, a remote apparatus through a communication link, comprising: a receptacle for a power source; integrated circuitry for localized recording, editing, storage and playback of audio signals powered from said receptacle; non-volatile storage means, access to which is controlled by said integrated circuitry; a built-in speaker and microphone coupled with said integrated circuitry for audible playback and local input, respectively, of audio; a telecommunications interface chip set coupled with said integrated circuitry; a modular telephone jack coupled to said modem chip set; the integrated circuitry operating the device so as to transmit and receive audio signals at a rate substantially faster than originally recorded.
21. A device in accordance with claim 19 wherein said integrated circuitry includes a module that is operative to permit distinguishing between analog and digital signals received on the communication link, the analog signals being presented to said integrated circuitry without being processed by said telecommunications interface chip.
22. A device in accordance with claim 19 wherein said integrated circuitry includes a module permitting communication via said communication link over the internet utilizing at least one protocol available thereover.
23. A device in accordance with claim 19 wherein said integrated circuitry includes a module that recognizes a signal received over the communication link as text and converts the signal to a signal emulating the sound of a human voice speaking the text.
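
Illustrative note (not part of the claims): several claims above recite assembling stored digital messages into a packetized data stream containing data and control bits. The claims do not specify a packet layout, so the following is only a minimal sketch under stated assumptions. The 5-byte header (sequence number, control flags, payload length), the `FLAG_LAST` control bit, the CRC-32 trailer, and the function names `packetize` and `reassemble` are all hypothetical and chosen purely for illustration.

```python
import struct
import zlib

# Hypothetical header layout: sequence number (2 bytes), control flags (1 byte),
# payload length (2 bytes), all big-endian. Not taken from the patent.
HEADER = struct.Struct(">HBH")
FLAG_LAST = 0x01  # hypothetical control bit marking the final packet of a message


def packetize(message: bytes, payload_size: int = 256) -> list:
    """Split a stored digital message into packets carrying data and control bits."""
    chunks = [message[i:i + payload_size]
              for i in range(0, len(message), payload_size)] or [b""]
    packets = []
    for seq, chunk in enumerate(chunks):
        flags = FLAG_LAST if seq == len(chunks) - 1 else 0
        header = HEADER.pack(seq, flags, len(chunk))
        # A CRC-32 trailer lets the receiver detect corrupted packets.
        trailer = struct.pack(">I", zlib.crc32(header + chunk))
        packets.append(header + chunk + trailer)
    return packets


def reassemble(packets: list) -> bytes:
    """Verify each packet and rebuild the original message in sequence order."""
    parts = []
    for packet in packets:
        header = packet[:HEADER.size]
        body = packet[HEADER.size:-4]
        (crc,) = struct.unpack(">I", packet[-4:])
        seq, _flags, length = HEADER.unpack(header)
        if crc != zlib.crc32(header + body) or length != len(body):
            raise ValueError(f"corrupt packet {seq}")
        parts.append((seq, body))
    return b"".join(body for _, body in sorted(parts))
```

In this sketch a stored message would be passed to `packetize`, the resulting packets handed to the digital input of the telecommunications interface, and `reassemble` run at the receiving end. Because the stream is data rather than real-time voice, it can be sent as fast as the link allows, which is the point of the "substantially higher rate" limitation.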
PCT/US1998/007228 1997-04-11 1998-04-11 Personal audio message processor and method WO1998047252A2 (en)

Priority Applications (5)

Application Number Priority Date Filing Date Title
EP98914674A EP1060616A2 (en) 1997-04-11 1998-04-11 Personal audio message processor and method
JP10544101A JP2001503236A (en) 1997-04-11 1998-04-11 Personal voice message processor and method
IL13230698A IL132306A0 (en) 1997-04-11 1998-04-11 Personal audio message processor and method
AU68976/98A AU6897698A (en) 1997-04-11 1998-04-11 Personal audio message processor and method
CA002286043A CA2286043A1 (en) 1997-04-11 1998-04-11 Personal audio message processor and method

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US4330297P 1997-04-11 1997-04-11
US60/043,302 1997-04-11

Publications (2)

Publication Number Publication Date
WO1998047252A2 (en) 1998-10-22
WO1998047252A3 (en) 2000-07-27

Family

ID=21926475

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1998/007228 WO1998047252A2 (en) 1997-04-11 1998-04-11 Personal audio message processor and method

Country Status (7)

Country Link
EP (1) EP1060616A2 (en)
JP (1) JP2001503236A (en)
CN (1) CN1260924A (en)
AU (1) AU6897698A (en)
CA (1) CA2286043A1 (en)
IL (1) IL132306A0 (en)
WO (1) WO1998047252A2 (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2000036828A1 (en) * 1998-12-16 2000-06-22 Sony Computer Entertainment Inc. Portable electronic device, method of controlling the device and recording medium for recording data used by the device
WO2001069393A1 (en) * 2000-03-15 2001-09-20 E-Device Inc Integrated monolithic electronic component for connection on an internet network
JP2002209273A (en) * 2000-11-17 2002-07-26 Symbol Technologies Inc Equipment and method for radio communication
DE10143450A1 (en) * 2001-09-05 2003-04-03 Oliver Dohn Using mobile telephone as dictation machine with transmission of voice memory over mobile radio network to target computer
DE19829247B4 (en) * 1998-06-30 2008-12-24 Mayah Communications Gmbh Recording, processing and transmission device
CN101547050B (en) * 2001-09-21 2011-07-20 雅马哈株式会社 Audio signal editing apparatus and control method therefor
CN105355230A (en) * 2015-10-14 2016-02-24 宁波萨瑞通讯有限公司 Processing method of high-quality music digital product
US9432516B1 (en) 2009-03-03 2016-08-30 Alpine Audio Now, LLC System and method for communicating streaming audio to a telephone device
US9444868B2 (en) 2000-03-28 2016-09-13 Affinity Labs Of Texas, Llc System to communicate media

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2463922C (en) 2001-06-27 2013-07-16 4 Media, Inc. Improved media delivery platform
US7529847B2 (en) * 2003-03-20 2009-05-05 Microsoft Corporation Access to audio output via capture service
CN1836432B (en) * 2003-06-17 2011-01-26 诺基亚西门子通信有限责任两合公司 More economical resource application on the user interaction with a speech dialogue system in a packet network by means of a simplifying processing of signalling information
EP2442587A1 (en) * 2010-10-14 2012-04-18 Harman Becker Automotive Systems GmbH Microphone link system
CN102025745B (en) * 2010-12-20 2014-06-04 西安西电捷通无线网络通信股份有限公司 Method and system for filtering network packets based on CS (client/server) structure
CN102322928B (en) * 2011-06-15 2014-03-12 天津九安医疗电子股份有限公司 Electronic scale, mobile equipment, body weight measuring system and wireless transmission method
CN108538661A (en) * 2018-05-31 2018-09-14 合肥开关厂有限公司 A kind of mining urgent latch switch
CN108874442A (en) * 2018-06-08 2018-11-23 山东超越数控电子股份有限公司 A kind of implementation method of the Domestic Platform system simulation based on QEMU
CN110221804A (en) * 2019-04-25 2019-09-10 努比亚技术有限公司 A kind of audible notification method, wearable device and computer readable storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5479411A (en) * 1993-03-10 1995-12-26 At&T Corp. Multi-media integrated message arrangement
US5608786A (en) * 1994-12-23 1997-03-04 Alphanet Telecom Inc. Unified messaging system and method
US5732216A (en) * 1996-10-02 1998-03-24 Internet Angles, Inc. Audio message exchange system
US5768513A (en) * 1996-06-27 1998-06-16 At&T Corp. Multimedia messaging using the internet

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE19829247B4 (en) * 1998-06-30 2008-12-24 Mayah Communications Gmbh Recording, processing and transmission device
WO2000036828A1 (en) * 1998-12-16 2000-06-22 Sony Computer Entertainment Inc. Portable electronic device, method of controlling the device and recording medium for recording data used by the device
WO2001069393A1 (en) * 2000-03-15 2001-09-20 E-Device Inc Integrated monolithic electronic component for connection on an internet network
US9444868B2 (en) 2000-03-28 2016-09-13 Affinity Labs Of Texas, Llc System to communicate media
US9621615B2 (en) 2000-03-28 2017-04-11 Affinity Labs Of Texas, Llc System to communicate media
US9923944B2 (en) 2000-03-28 2018-03-20 Affinity Labs Of Texas, Llc System to communicate media
US10341403B2 (en) 2000-03-28 2019-07-02 Affinity Labs Of Texas, Llc System to communicate media
JP2002209273A (en) * 2000-11-17 2002-07-26 Symbol Technologies Inc Equipment and method for radio communication
DE10143450A1 (en) * 2001-09-05 2003-04-03 Oliver Dohn Using mobile telephone as dictation machine with transmission of voice memory over mobile radio network to target computer
CN101547050B (en) * 2001-09-21 2011-07-20 雅马哈株式会社 Audio signal editing apparatus and control method therefor
US9432516B1 (en) 2009-03-03 2016-08-30 Alpine Audio Now, LLC System and method for communicating streaming audio to a telephone device
CN105355230A (en) * 2015-10-14 2016-02-24 宁波萨瑞通讯有限公司 Processing method of high-quality music digital product

Also Published As

Publication number Publication date
JP2001503236A (en) 2001-03-06
EP1060616A2 (en) 2000-12-20
WO1998047252A3 (en) 2000-07-27
CA2286043A1 (en) 1998-10-22
AU6897698A (en) 1998-11-11
CN1260924A (en) 2000-07-19
IL132306A0 (en) 2001-03-19

Similar Documents

Publication Publication Date Title
WO1998047252A2 (en) Personal audio message processor and method
US7310329B2 (en) System for sending text messages converted into speech through an internet connection to a telephone and method for running it
US6600930B1 (en) Information provision system, information regeneration terminal, and server
US5987528A (en) Controlling the flow of electronic information through computer hardware
US7623854B2 (en) Information addition system and mobile communication terminal
US20160142527A1 (en) Methods and apparatuses for programming user-defined information into electronic devices
US6510438B2 (en) Electronic mail system, method of sending and receiving electronic mail, and storage medium
JPH0823383A (en) Communication system
JPH10508397A (en) Ultra-compact personal portable information devices
US8594651B2 (en) Methods and apparatuses for programming user-defined information into electronic devices
US6810077B1 (en) System and method for providing informative communication
US20060015638A1 (en) Method and apparatus for initiating telephone call from a mobile device
JPH11282864A (en) Information processor and its control method
US6216156B1 (en) Internet message communicator with direct output to a hard copy device
CN1351459A (en) Hand communication and processing device and operation thereof
CN100429912C (en) Telephone for telling caller's name
US5909555A (en) Method and system for supporting data communication between personal computers using audio drivers, microphone jacks, and telephone jacks
JPH0830352A (en) Information processor
JP3047264U (en) Internet-only terminal
KR20010095375A (en) Service method of unified e-mail through network communication and machine readable media including memorized program to execute thereof
JPH06189384A (en) Microphone with display device
JPH07123271B2 (en) Electronic mail system
JP2000066871A (en) Voice adaptive portable electronic equipment device
JP2004221857A (en) Multifunction communication equipment
JP2004146896A (en) Universal key and data processing method using the same

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 132306

Country of ref document: IL

Ref document number: 98805009.9

Country of ref document: CN

AK Designated states

Kind code of ref document: A2

Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GE GW HU ID IL IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US UZ VN YU ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW SD SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN ML MR NE SN TD TG

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
121 Ep: the epo has been informed by wipo that ep was designated in this application
ENP Entry into the national phase

Ref document number: 2286043

Country of ref document: CA

Kind code of ref document: A

WWE Wipo information: entry into national phase

Ref document number: 09402867

Country of ref document: US

ENP Entry into the national phase

Ref document number: 1998 544101

Country of ref document: JP

Kind code of ref document: A

WWE Wipo information: entry into national phase

Ref document number: 1998914674

Country of ref document: EP

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

AK Designated states

Kind code of ref document: A3

Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GE GW HU ID IL IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US UZ VN YU ZW

AL Designated countries for regional patents

Kind code of ref document: A3

Designated state(s): GH GM KE LS MW SD SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN ML MR NE SN TD TG

WWP Wipo information: published in national office

Ref document number: 1998914674

Country of ref document: EP

WWW Wipo information: withdrawn in national office

Ref document number: 1998914674

Country of ref document: EP