Navigation project startup | Projects | unofficial empeg BBS


#39771 - 30/11/2001 01:37 Re: Navigation project startup [Re: tonyc]
kim member Registered: 21/07/1999 Posts: 140 Loc: Helsinki, Finland	*I was just asking if Kim was using Flite or something else. If he had gotten Flite to output to /dev/audio, he would have solved one of my current problems and I would have begged to see what he did. :)* As Rob said, I'm using offline created PCM samples, which are generated with AT&T Labs Natural Voices which is the best TTS I've ever heard and which can be freely used. For the navigation system, it works fairly well. I mostly need numbers anyhow, and I can easily represent any decimal or floating-point number between 0 and 999 in high quality with a single 400 KB WAV file. For other speech prompts I need to generate them manually one by one. Kim

#39772 - 30/11/2001 05:21 Re: Navigation project startup [Re: tonyc]
peter carpal tunnel Registered: 13/07/2000 Posts: 4182 Loc: Cambridge, England	*...the 4608 byte restriction on /dev/audio. Flite writes 256 bytes at a time.* dd obs=4608? Peter
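Peter's `dd obs=4608` suggestion re-blocks small writes into 4608-byte output blocks. The same idea can be done inside a program; the following is only a rough sketch under stated assumptions, not code from the thread. The names `block_writer`, `block_writer_put` and `block_writer_flush` are hypothetical, and padding the last block with zeros (silence) is an assumption about what the device tolerates.

```c
#include <stddef.h>
#include <string.h>
#include <unistd.h>

#define AUDIO_BLOCK 4608  /* block size /dev/audio expects, per the thread */

/* Hypothetical helper type: collects small writes into full blocks. */
typedef struct {
    unsigned char data[AUDIO_BLOCK];
    size_t used;
} block_writer;

/* Write all of buf to fd, retrying on short writes. Returns 0 or -1. */
static int write_all(int fd, const unsigned char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n <= 0)
            return -1;
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Queue len bytes; every time the buffer fills, emit one 4608-byte block. */
int block_writer_put(block_writer *w, int fd, const void *src, size_t len)
{
    const unsigned char *p = src;
    while (len > 0) {
        size_t room = AUDIO_BLOCK - w->used;
        size_t n = len < room ? len : room;
        memcpy(w->data + w->used, p, n);
        w->used += n;
        p += n;
        len -= n;
        if (w->used == AUDIO_BLOCK) {
            if (write_all(fd, w->data, AUDIO_BLOCK) != 0)
                return -1;
            w->used = 0;
        }
    }
    return 0;
}

/* Pad the final partial block with zeros (assumed to be silence) and emit it. */
int block_writer_flush(block_writer *w, int fd)
{
    if (w->used == 0)
        return 0;
    memset(w->data + w->used, 0, AUDIO_BLOCK - w->used);
    w->used = 0;
    return write_all(fd, w->data, AUDIO_BLOCK);
}
```

A TTS engine that produces 256 bytes at a time would call `block_writer_put` for each chunk and `block_writer_flush` once at the end of the utterance.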

#39773 - 30/11/2001 08:05 Re: Navigation project startup [Re: kim]
rob carpal tunnel Registered: 21/05/1999 Posts: 5335 Loc: Cambridge UK	*As Rob said, I'm using offline created PCM samples, which are generated with AT&T Labs Natural Voices which is the best TTS I've ever heard and which can be freely used.* It can be used freely nn times a day from their web site, but a single user desktop licence (with which the PCM output cannot be distributed) costs $49. More useful commercial licences are somewhat more expensive, but still excellent value for money considering the quality. Rob

#39774 - 30/11/2001 09:18 Re: Navigation project startup [Re: peter]
tonyc carpal tunnel Registered: 27/06/1999 Posts: 7058 Loc: Pittsburgh, PA	Yeah... even when I do dd if=file.wav of=/dev/audio obs=4608 I don't get any sound at all. So I'm thinking it has something to do with the Sound Overlay kernel which is currently installed on my Empeg. I'm going to try to back that out and see if I am still getting silent sound output. I just remembered this morning that I had installed that kernel patch. Not sure if that's the problem.. Assuming the WAV file isn't silence, I should get sound output when I run that dd command, right? There's no ioctl's I have to call to change the volume or select PCM source, are there? _________________________ - Tony C my empeg stuff

#39775 - 30/11/2001 11:41 Re: Navigation project startup [Re: tonyc]
kim member Registered: 21/07/1999 Posts: 140 Loc: Helsinki, Finland	*So I'm thinking it has something to do with the Sound Overlay kernel which is currently* If you open /dev/audio without the O_SYNC flag, there is no difference in how the audio is output. *Assuming the WAV file isn't silence, I should get sound output when I run that dd command, right? There's no ioctl's I have to call to change the volume or select PCM source, are there?* There are multiple. And if you try this while the player application is not running, it's likely that the soft audio mute is enabled. After you've opened /dev/audio, try this:

    int iMixer = open( "/dev/mixer", O_RDONLY );
    int iSource = SOUND_MASK_PCM;
    int iFlags = 0;                  // source not muted
    int iSAM = 0;                    // SAM is off
    int iVolume = 100 | (100 << 8);  // left and right volume, both 100
    ioctl( iMixer, _IOW( 'm', 0, int ), &iSource );   // set source
    ioctl( iMixer, _IOW( 'm', 1, int ), &iFlags );    // set flags
    ioctl( iMixer, _IOW( 'm', 15, int ), &iSAM );     // set Soft Audio Mute
    ioctl( iMixer, MIXER_WRITE( SOUND_MIXER_VOLUME ), &iVolume );
    close( iMixer );

Kim

#39776 - 30/11/2001 11:50 Re: Navigation project startup [Re: kim]
tonyc carpal tunnel Registered: 27/06/1999 Posts: 7058 Loc: Pittsburgh, PA	Aaaaaaah I think it is that soft audio mute... I was confusing that with cell-phone mute which I don't have tied to anything... I didn't know that SAM was enabled when the player exits. That's the problem. Thanks muchly. What the hell is soft audio mute supposed to do anyway? _________________________ - Tony C my empeg stuff

#39777 - 30/11/2001 14:34 Re: Navigation project startup [Re: kim]
kazama enthusiast Registered: 11/11/2000 Posts: 202 Loc: Boston, MA	OK, so you know I would be posting on this given recent events. L&H's website also has a free converter for text-to-speech with a female voice. IMHO, the AT&T Labs version sounds much better but at least you have another resource. I can't wait to get my hands on some of this technology to play with. Greg

#39778 - 30/11/2001 15:12 Re: Navigation project startup [Re: kim]
tonyc carpal tunnel Registered: 27/06/1999 Posts: 7058 Loc: Pittsburgh, PA	Okay Kim you seem to have a handle on this sound stuff... explain this to me. I can now hear output, but I can't set any of the parameters like are set in the pcmplay example... So 8 kHz wave files come out sounding like a record playing super fast. In order to get pcmplay to work, I have to comment out all of the mixer ioctl's which set the format, frequency, etc... Or I get something like this:

    ioctl(SNDCTL_DSP_SETFMT): Invalid argument

Here's the section I had to comment out to get it to play anything:

    format = AFMT_S16_LE;
    if (ioctl(fd, SNDCTL_DSP_SETFMT, &format) == -1) {
        perror("ioctl(SNDCTL_DSP_SETFMT)");
        return -1;
    }
    if (format != AFMT_S16_LE) {
        fprintf(stderr, "AFMT_S16_LE not available\n");
        return -1;
    }
    stereo = 1;
    if (ioctl(fd, SNDCTL_DSP_STEREO, &stereo) == -1) {
        perror("ioctl(SNDCTL_DSP_STEREO)");
        return -1;
    }
    if (!stereo) {
        fprintf(stderr, "stereo selection failed\n");
        return -1;
    }
    speed = 44100;
    if (ioctl(fd, SNDCTL_DSP_SPEED, &speed) == -1) {
        perror("ioctl(SNDCTL_DSP_SPEED)");
        return -1;
    }
    if (speed != 44100) {
        fprintf(stderr, "sample speed 44100 not available (closest %u)\n", speed);
        return -1;
    }

Any idea what could be happening here? _________________________ - Tony C my empeg stuff

#39779 - 30/11/2001 17:00 Re: Navigation project startup [Re: tonyc]
kim member Registered: 21/07/1999 Posts: 140 Loc: Helsinki, Finland	*Okay Kim you seem to have a handle on this sound stuff... explain this to me. I can now hear output, but I can't set any of the parameters like are set in the pcmplay example...* The pcmplay example is obsolete... *So 8 kHz wave files come out sounding like a record playing super fast. In order to get pcmplay to work, I have to comment out all of the mixer ioctl's which set the format, frequency, etc...* The audio input is locked at 44.1 kHz, 16-bit stereo, little-endian signed, with a buffer size of 4608 bytes (the size of one MPEG frame). If the program only outputs PCM at 8 kHz, you need to convert it to 44.1 kHz yourself. Kim
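The numbers in Kim's description fit together: at 16-bit stereo each frame (one left plus one right sample) takes 4 bytes, so 4608 bytes hold 1152 frames, which is the per-channel sample count of an MPEG-1 Layer III frame. The sketch below just spells out that arithmetic; the function names are made up for illustration and nothing here comes from the empeg sources.

```c
enum {
    SAMPLE_RATE_HZ   = 44100, /* fixed output rate */
    CHANNELS         = 2,     /* stereo */
    BYTES_PER_SAMPLE = 2,     /* 16-bit signed */
    BLOCK_BYTES      = 4608   /* one write to /dev/audio */
};

/* Stereo frames (one left + one right sample) in one block. */
int frames_per_block(void)
{
    return BLOCK_BYTES / (CHANNELS * BYTES_PER_SAMPLE);
}

/* Playing time of one block in microseconds, rounded down. */
long block_duration_us(void)
{
    return (long)frames_per_block() * 1000000L / SAMPLE_RATE_HZ;
}

/* Bytes consumed per second of audio at the fixed format. */
long bytes_per_second(void)
{
    return (long)SAMPLE_RATE_HZ * CHANNELS * BYTES_PER_SAMPLE;
}
```

So each 4608-byte write covers roughly 26 ms of sound, and a program feeding the device has to supply about 176 KB of PCM per second.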

#39780 - 30/11/2001 17:09 Re: Navigation project startup [Re: kim]
tonyc carpal tunnel Registered: 27/06/1999 Posts: 7058 Loc: Pittsburgh, PA	*The audio input is locked at 44.1 kHz, 16-bit stereo, little-endian signed, with a buffer size of 4608 bytes (the size of one MPEG frame).* Doh! Is that new or has it always been that way? The pcmplay example would have one believe you can select mono/stereo, 11/22/44 kHz, etc... I knew the size of the buffer was locked... This is a bit disappointing to my plans to get text-to-speech on the player. Sigh. _________________________ - Tony C my empeg stuff

#39781 - 30/11/2001 17:22 Re: Navigation project startup [Re: tonyc]
tonyc carpal tunnel Registered: 27/06/1999 Posts: 7058 Loc: Pittsburgh, PA	So, continuing the thoughts in my last post... If it's indeed impossible to change sample rates, etc... Would it be possible to make some kernel modifications to enable the playing of formats other than 16-bit 44.1 kHz stereo? I mean this is REALLY limiting for a product whose primary purpose is as an audio player! I know this locking was chosen to keep the visuals in sync with the audio, but does that mean there is no way to keep the visuals in sync and still accept other formats? _________________________ - Tony C my empeg stuff

#39782 - 30/11/2001 17:53 Re: Navigation project startup [Re: tonyc]
tfabris carpal tunnel Registered: 20/12/1999 Posts: 31633 Loc: Seattle, WA	That sounds more like a DSP limitation than a kernel limitation to me. Mind you, I know nothing about this stuff, it's just an observation off the top of my head. If that's true, then any translation/resampling work would have to be done in software. _________________________ Tony Fabris

#39783 - 30/11/2001 20:58 Re: Navigation project startup [Re: tfabris]
tonyc carpal tunnel Registered: 27/06/1999 Posts: 7058 Loc: Pittsburgh, PA	Yeah it appears there's no way to play anything other than 44.1 kHz stereo. Sigh. So I have to figure out a way to do some conversion. I have zero experience in this. Anyone have any idea how one would "upsample" an 8000 or 11000 Hz mono wave to 44.1 kHz stereo? After about half an hour of looking I couldn't find any way in the flite code to output at 44.1 kHz or in stereo. So I'd have to take its output and upsample it before it writes it... I just have no clue how this would be done. This is bound to be very inefficient. CRAP. I am so angry about this limitation. Even though I know why it's there, I wish there was like a second audio device that didn't have such stringent requirements. The comments in empeg_audio3.c state "wishlist: sample rate adjustment with antialiasing filters." If this would allow us to write non-standard sample rate files to the audio device, then I hope this wish comes true. Hugo, are you listening? Or maybe there are some other smart people out there who could handle something like this, since the likelihood of this becoming an official feature is rather slim. Sigh. _________________________ - Tony C my empeg stuff

#39784 - 01/12/2001 03:26 Re: Navigation project startup [Re: tonyc]
rob carpal tunnel Registered: 21/05/1999 Posts: 5335 Loc: Cambridge UK	That's the rate supported by the hardware. If it isn't convenient you can resample in software - it's not such a great programming challenge really (i.e. I'm sure someone on here knows how to do it - we had to!). Rob

#39785 - 01/12/2001 03:28 Re: Navigation project startup [Re: tonyc]
rob carpal tunnel Registered: 21/05/1999 Posts: 5335 Loc: Cambridge UK	Check here.. http://www-ccrma.stanford.edu/~jos/resample/

#39786 - 01/12/2001 11:21 Re: Navigation project startup [Re: rob]
tonyc carpal tunnel Registered: 27/06/1999 Posts: 7058 Loc: Pittsburgh, PA	Thanks for the link, Rob. Now, does anyone have any pointers on how to convert from mono to stereo? I want to do sample rate conversion and mono->stereo conversion in real-time so I'll have to graft this resampling code in, but I also need something to switch it to stereo.. any ideas? _________________________ - Tony C my empeg stuff

#39787 - 01/12/2001 11:57 Re: Navigation project startup [Re: tonyc]
Nosferatu enthusiast Registered: 24/08/2001 Posts: 344 Loc: France, Champagne	A question for Smu: will you consider localizing the audio prompts, or just leave them in English? _________________________ Empeg IIa - 10 Gb - Red Fascia - Tuner, the day is coming - I Will Strike From the Grey -

#39788 - 01/12/2001 12:24 Re: Navigation project startup [Re: tonyc]
kim member Registered: 21/07/1999 Posts: 140 Loc: Helsinki, Finland	*Now, does anyone have any pointers on how to convert from mono to stereo?* Erm... Just output the same sample twice. Each sample is one 16-bit integer for one channel. For stereo, just output the same 16-bit integer twice in a row (for the left and right channels). Sample rate conversion from 11.025 kHz to 22.05 kHz or 44.1 kHz is very easy, as you only need to output the same sample either two or four times, since the rates are exact multiples. 8 kHz is more difficult as you'd need to filter it upwards, which is more expensive. For instance, to convert 11.025 kHz mono sound to 44.1 kHz stereo sound, you just output the same sample 8 times (x 4 for sample rate conversion, x 2 for stereo). Kim
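Kim's recipe (repeat each sample, and write it once per channel) could look roughly like the function below. This is a sketch of the approach as described in the post, not code from the thread; the name `upsample_mono_to_stereo` and its signature are hypothetical. Plain repetition is cheap but does no filtering, so it may sound slightly harsher than a properly interpolated signal.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Expand mono samples to interleaved stereo, repeating each input sample
 * `factor` times per channel (factor 4 takes 11.025 kHz up to 44.1 kHz).
 * `out` must have room for in_len * factor * 2 samples.
 * Returns the number of int16_t values written.
 */
size_t upsample_mono_to_stereo(const int16_t *in, size_t in_len,
                               unsigned factor, int16_t *out)
{
    size_t o = 0;
    for (size_t i = 0; i < in_len; i++) {
        for (unsigned r = 0; r < factor; r++) {
            out[o++] = in[i]; /* left */
            out[o++] = in[i]; /* right */
        }
    }
    return o;
}
```

With `factor` set to 4, every input sample ends up in the output 8 times, matching the "x 4 for sample rate conversion, x 2 for stereo" count in the post.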

#39789 - 01/12/2001 12:51 Re: Navigation project startup [Re: kim]
tonyc carpal tunnel Registered: 27/06/1999 Posts: 7058 Loc: Pittsburgh, PA	Thanks for the excellent information, Kim. I figured stereo would be something like that, but I wasn't sure if it was a block of left samples followed by a block of right samples, or whether channels alternated on each sample. Thanks for the clarification. The Flite software only comes with an 8 kHz voice right now. Future versions might have 11 kHz voices. I am going to try to mix in some of the upsampling code that Rob pointed me to and then try to hack in the mono-stereo conversion on my own. After all of that work, I can only pray that it performs fast enough to allow for somewhat real-time text-to-speech. Hey, I've been wanting to dig into a more low-level project on the Empeg, so this isn't totally a bad thing. Should be fun. _________________________ - Tony C my empeg stuff

#39790 - 01/12/2001 13:48 Re: Navigation project startup [Re: tonyc]
hybrid8 carpal tunnel Registered: 12/11/2001 Posts: 7738 Loc: Toronto, CANADA	Don't forget to include a hidden option for taunts and other insults. That way you can trigger them from the remote without having them visible on the screen. Passengers will get a kick out of it. :) Bruno _________________________ Bruno Twisted Melon : Fine Mac OS Software

#39791 - 01/12/2001 17:38 Re: Navigation project startup [Re: Nosferatu]
smu old hand Registered: 30/07/2000 Posts: 879 Loc: Germany (Ruhrgebiet)	Hi. I don't intend to do the sound output stuff myself (yet), so this will depend on co-developers. Anyhow, my nav project is advancing _really_ slowly at the moment; my diploma thesis is top priority for me. However, I would like the software to have at least English and German speech output. No street names though, if there is no realtime TTS software that is completely free (BSD style license or something like that). cu, sven _________________________ proud owner of MkII 40GB & MkIIa 60GB both lit by God and HiJacked by Lord

#39792 - 02/12/2001 08:16 Re: Navigation project startup [Re: smu]
Dearing addict Registered: 22/07/1999 Posts: 453 Loc: Florida	Sven, It might be a good idea to incorporate some IVR (Interactive Voice Response) tactics in the design of your Nav. Mainly, allowing the user to record his/her own prompts. This works for the majority of your grammar (North, South, 10 Miles, etc), and allows for dynamic data like street names. We've been doing this in the IVR industry for decades. If you want, email me off-list and I can give you some ideas. Jason _________________________ _~= Dearing =~_ Gettin' back into it thanks to slimrio!

#39793 - 02/12/2001 15:11 Re: Navigation project startup [Re: hybrid8]
xavyer member Registered: 19/12/1999 Posts: 117	Along with that little red button under the gear shift knob?

#39794 - 02/12/2001 17:24 Re: Navigation project startup [Re: Dearing]
smu old hand Registered: 30/07/2000 Posts: 879 Loc: Germany (Ruhrgebiet)	Hi Jason. My plans go in that direction, but as of now, I am too far away from that stage of development to actually think about the deeper program structures. But I promise to keep all that in mind. cu, sven _________________________ proud owner of MkII 40GB & MkIIa 60GB both lit by God and HiJacked by Lord

#39795 - 13/12/2001 20:47 Re: Navigation project startup [Re: kim]
mlord carpal tunnel Registered: 29/08/2000 Posts: 14526 Loc: Canada	The existing Linux/unix "sox" program is excellent for performing rate/channel/format conversions to/from just about any sound file format. "Sound eXchange : universal sound sample translator"

#39796 - 14/12/2001 08:20 Re: Navigation project startup [Re: mlord]
tonyc carpal tunnel Registered: 27/06/1999 Posts: 7058 Loc: Pittsburgh, PA	Oh.... maybe I'll steal some of the code from there instead of rolling my own. It looks like SoX is made for converting files, whereas I really want to convert the samples as they're generated by the text-to-speech program. I guess I can find the relevant pieces of code and hammer them into the TTS software's source somewhere. Certainly not as elementary as a /dev/audio that accepts other sample rates, but certainly doable. _________________________ - Tony C my empeg stuff

#39797 - 14/12/2001 10:34 Re: Navigation project startup [Re: tonyc]
mlord carpal tunnel Registered: 29/08/2000 Posts: 14526 Loc: Canada	Sox will work in a pipe as well, but the buffering may or may not match your needs.

#39798 - 14/12/2001 13:21 Re: Navigation project startup [Re: mlord]
tonyc carpal tunnel Registered: 27/06/1999 Posts: 7058 Loc: Pittsburgh, PA	Ah... I will toy with it this weekend probably, then. _________________________ - Tony C my empeg stuff

#39799 - 02/02/2002 20:39 Re: Navigation project startup [Re: kim]
TheAmigo enthusiast Registered: 14/09/2000 Posts: 363	Kinda thinking aloud here: 44100 / 8000 is almost 5.5, and 8000 * 5.5 = 44000. Close enough that the playback speed would be off by 0.23%. Prolly not enough to make a difference for TTS. Offhand, I don't know how to do the half in there. If you read in pairs of samples and output 11 at a time, how do you do it?
A: input = (-14, 120) output = (-14, -14, -14, -14, -14, -14, 120, 120, 120, 120, 120)
B: input = (-14, 120) output = (-14, -14, -14, -14, -14, 53, 120, 120, 120, 120, 120)
C: none of the above... something better.
At least A and B would be pretty easy to program and not very CPU intensive. _________________________ --The Amigo
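One candidate for option C is linear interpolation at the exact 8000:44100 ratio, which avoids both the 5.5 approximation and the question of where to put the "half" sample. The sketch below is an illustration only, not what anyone in the thread implemented; `resampled_length` and `resample_linear` are hypothetical names, and there is no anti-aliasing or band-limiting filter of the kind the resampling page Rob linked describes.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of output samples produced for in_len input samples. */
size_t resampled_length(size_t in_len, uint32_t in_rate, uint32_t out_rate)
{
    return (size_t)((uint64_t)in_len * out_rate / in_rate);
}

/*
 * Resample mono 16-bit audio from in_rate to out_rate by linear
 * interpolation between neighbouring input samples.
 * `out` must hold resampled_length(in_len, in_rate, out_rate) samples.
 * Returns the number of samples written.
 */
size_t resample_linear(const int16_t *in, size_t in_len,
                       uint32_t in_rate, uint32_t out_rate, int16_t *out)
{
    size_t out_len = resampled_length(in_len, in_rate, out_rate);
    for (size_t k = 0; k < out_len; k++) {
        uint64_t pos = (uint64_t)k * in_rate;      /* input position in units of 1/out_rate */
        size_t idx = (size_t)(pos / out_rate);     /* input sample at or before that position */
        int64_t frac = (int64_t)(pos % out_rate);  /* distance past idx, 0..out_rate-1 */
        int64_t a = in[idx];
        int64_t b = (idx + 1 < in_len) ? in[idx + 1] : a; /* hold the last sample */
        out[k] = (int16_t)(a + (b - a) * frac / (int64_t)out_rate);
    }
    return out_len;
}
```

For a pair of input samples this also yields 11 outputs, but the values ramp from the first sample toward the second instead of jumping. The cost is one multiply and one divide per output sample, so whether it is fast enough on the player would need measuring.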

#39800 - 06/02/2002 19:02 Re: Navigation project startup [Re: smu]
eternalsun Pooh-Bah Registered: 09/09/1999 Posts: 1721 Loc: San Jose, CA	Smu, Is the navigation project going well? If you guys would like, I can contribute to the project in the way of maps. I have several contacts over at NavTech. While a deal directly with NavTech is not likely to be struck, there are backchannels and other methods for obtaining the license cheaply. If there is already a source for the maps, then I'll let this slide. Let me know. Calvin
