
NeoSpeech’s Text-to-Speech Embedded SDK Overview


Embed your mobile app or device with NeoSpeech’s natural sounding text-to-speech voices.

Embedded Text to Speech

This blog is part 3 of a 7-part blog series highlighting each of NeoSpeech’s text-to-speech solutions.

Giving your smart phone application or mobile device a voice shouldn’t be difficult, which is why we made it easy.

NeoSpeech’s VoiceText Embedded SDK package was specifically designed with embedded operating systems in mind. Mobile apps and embedded devices run on very specific operating systems, have limited storage space, and can have a low amount of processing power. Finding a high quality text-to-speech engine that can be compatible with an embedded product’s specifications has long been a challenge for developers.

Fortunately, our Embedded SDK package is the solution that easily overcomes this challenge.

So who is this solution for? What does it come with? And how does it work? Let’s take a look at everything you should know about our VoiceText Embedded SDK package.

What is it?

In terms of what the package is composed of, our Embedded package is nearly identical to our Engine package. When you purchase our VoiceText Embedded SDK, we send you the complete text-to-speech engine and SDK.

Text-to-Speech Engine

This is what makes the magic happen! It’s what gives your product a voice! The text-to-speech engine, or TTS engine for short, is able to analyze inputted text and convert it into speech. The method of how it puts together the speech depends on the type of engine you get.

Either way, with the TTS engine in hand, you’ll be able to convert any amount of text into high quality speech instantly!

SDK

The software development kit (SDK) is what will enable you to easily integrate the TTS engine into your custom application or device. An SDK is defined as a set of tools that a software developer uses to help build applications.

So if integrating a powerful TTS engine into an embedded device sounds daunting, don’t sweat it. Our SDK (which includes a very informative manual on how to use it) makes the whole process simple and straightforward.

Who is this solution for?

If we had to sum it up in one sentence, we’d say that the VoiceText Embedded SDK package is for people wanting to give their embedded device or application a voice by hosting the TTS engine locally.

First, let’s define what an embedded device and an application are.

An embedded device is a device that contains a special computing system for a dedicated purpose. These aren’t general-purpose computers like your laptop that can do anything. Instead, embedded devices usually have one or a few specific purposes.

An example is a blood glucose monitor. This is a device that tests the levels of glucose in your blood. People suffering from diabetes often use blood glucose monitors to make sure their blood-sugar levels aren’t too low or too high.

Blood glucose monitors have a specific purpose. They test blood samples and display back the results. They aren’t general-purpose computers that can perform most common tasks. They are embedded devices. Other examples include ATMs, DVD players, smart home devices, and GPS devices.

An application, at least in this context, refers to an application for a mobile device. If you have a smartphone, then you probably have dozens of apps downloaded onto your device. Our VoiceText Embedded SDK is perfect for integrating text-to-speech into smartphone apps.

Now that you know what we mean by “embedded device or application”, you should have a pretty good idea if the Embedded SDK package is right for you. Is the product you’re developing going to be an embedded device? Or is it going to be an app for a smartphone? If you answered yes to either of these two questions, then our Embedded SDK package is probably right for you.

Note that in our one-sentence summary of who the Embedded SDK package is for, we mentioned that it’s for those who want to host their TTS engine locally. Hosting the TTS engine locally means that the engine is located and stored within the device or app itself. The other option would be to host your TTS engine either on your own server or through NeoSpeech’s cloud.

How does NeoSpeech optimize it for my needs?

As mentioned above, embedded devices and applications have very unique operating systems, specifications, and storage space limitations. Therefore, we at NeoSpeech make sure to optimize our Embedded SDK package to fit your needs.

Our Embedded SDK package works with several operating systems, including iOS, Android, Embedded Linux, WinCE, and Windows Mobile. We can also custom port our engine to your specific operating system if we don’t already support it.

You can also choose your desired sampling rate (which affects sound quality) and footprint size. Our embedded TTS engines range anywhere from 8MB to 900MB. Which footprint size and sampling rate you need depends on your device specifications and the application the TTS engine is being used for.
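To make the sampling-rate trade-off concrete, here’s some back-of-the-envelope arithmetic, assuming uncompressed 16-bit mono PCM output (illustrative only; real engine footprints and audio formats vary):

```python
# Rough storage math for synthesized audio, assuming uncompressed
# 16-bit mono PCM (illustrative; actual formats and footprints differ).
def pcm_bytes(sample_rate_hz, seconds, bytes_per_sample=2):
    """Raw audio size for a clip at a given sampling rate."""
    return int(sample_rate_hz * seconds * bytes_per_sample)

low_rate = pcm_bytes(8_000, 10)    # 10 s at telephone quality -> 160,000 bytes
high_rate = pcm_bytes(22_050, 10)  # 10 s at a higher sampling rate -> 441,000 bytes
```

Roughly doubling the sampling rate doubles the audio data per second, which is why lower rates suit tightly constrained devices.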

How does it work?

Here is a look at how you can use our VoiceText Embedded SDK package to integrate text-to-speech into your embedded device or app.

Purchasing process

There are no mind-bending hoops you have to jump through to get your hands on our Embedded SDK package. Just get in touch with our Sales Team. We’ll take it from there and make sure we get all the information we need so we optimize the package to your needs.

Installation

After purchasing the package, we’ll send you the complete TTS engine and SDK for download. You or your developer will need to download these files on the computer you’re using to develop your product.

Verification

Before you can do anything, you have to verify your software. In the same email we send you with the download files, we’ll send you a license key that’ll activate and unlock your TTS engine.

Integration

Here’s where the fun begins. At this point it’s time to integrate the TTS engine into your custom application or device.

To get started, just take a look at the SDK manual that comes with the SDK package. In there you’ll find clear, easy to follow, step-by-step directions on what to do.

Basically, you’ll just need to copy the TTS engine and a few files from the SDK into your device or application. Then, all you need to do is write the commands that tell your product how and when to use the TTS engine to create speech. That’s it!
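As a flavor of what that wiring looks like, here’s a hypothetical sketch in Python; the `EmbeddedTTSEngine` class, its `synthesize` method, and the voice name are invented stand-ins, not NeoSpeech’s actual SDK API:

```python
# Hypothetical sketch of the flow described above; the class and method
# names are invented stand-ins, NOT NeoSpeech's actual SDK API.
class EmbeddedTTSEngine:
    """Stand-in for the TTS engine files copied onto the device."""
    def __init__(self, voice):
        self.voice = voice

    def synthesize(self, text):
        # A real engine returns playable audio; the stub returns a marker.
        return "[{} audio] {}".format(self.voice, text).encode()

engine = EmbeddedTTSEngine(voice="Julie")        # loaded once at startup
audio = engine.synthesize("Battery level low.")  # called whenever the device must speak
```

The point is simply that, once the engine files are on the device, speaking is a local function call with no network round trip.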

Our SDK manual will go over all the commands you could use and how they work. With that in hand, you’ll give your embedded product an engaging, natural sounding voice in no time.

See? We told you it was easy.

NeoSpeech's embedded text-to-speech works for smartphone apps

That’s the overview of NeoSpeech’s VoiceText Embedded SDK solution! If you’ve decided that this is the solution for you, click the button below to get started!

I want the VoiceText Embedded package!

Be sure to check out the other blogs in this series to learn more about our other solutions. And if you need help figuring out which solution is right for you, read our free eBook on the matter.

Here’s the list of blogs in this series covering NeoSpeech’s solutions:

Learn more about NeoSpeech’s text-to-speech

Want to learn more about all the ways Text-to-Speech can be used? Visit our Text-to-Speech Areas of Application page. And check out our Text-to-Speech Products page to find the right package for any device or application.

If you’re interested in integrating Text-to-Speech technology into your product, please fill out our short Sales Inquiry form and we’ll get you all the information and tools you need.

Related Articles

New eBook, “Which Text-to-Speech Product Is Right For You?” released

Check out NeoSpeech’s Interactive Online Text-to-Speech Demo!

What is Text-to-Speech and How Does It Work?

Don’t forget to follow us on LinkedIn, Facebook, Google+, and Twitter!

The post NeoSpeech’s Text-to-Speech Embedded SDK Overview appeared first on Text2Speech Blog.


NeoSpeech Is Working On Singing Text-to-Speech Voices


NeoSpeech singing text to speech

Text-to-speech has become a mainstream technology due to its power to make speech more accessible. It gives both humans and machines the ability to speak. By receiving text prompts, a text-to-speech engine is able to instantly synthesize speech that can sound as realistic as an actual human.

However, there’s so much more that humans can do with their voices that text-to-speech hasn’t been able to replicate in the past. One such example is singing. People can use their voices to produce musical tones while speaking. Modern text-to-speech engines do not have the ability to replicate musical tones.

Now, we at NeoSpeech are working on taking our natural sounding text-to-speech voices and giving them the ability to sing beautiful melodies. Much like how text-to-speech made the ability to speak more accessible, we want to make the ability to sing more accessible.

Want to hear it for yourself? Here’s an example of what our Japanese singing text-to-speech voice sounds like:

How Singing Text-to-Speech Works

Singing text-to-speech isn’t necessarily new. It’s actually been around for a while. In fact, last year we covered the story of Hatsune Miku, the text-to-speech pop star.

Vocaloid computer programs like the one that lets users create songs by the persona Hatsune Miku can synthesize singing. Each program is created by having a voice actor record phonemes in different pitches. Then, using the computer program, users can put these phonemes together to create lyrics and melodies.

The problem with these programs is that they’re usually limited to just one voice and don’t sound very human-like.

NeoSpeech is looking to create more natural sounding singing text-to-speech voices through the use of HMM-based text-to-speech.

As discussed in our article, “Which Speech Synthesis Technique is Better?”, the HMM method is a statistical parametric synthesis technique that generates speech by taking recordings from a voice actor and modifying them to sound as similar as possible to the inputted text.

Basically, HMM-based text-to-speech engines don’t fully preserve the voice of the original voice actor when generating speech. Instead, the engine can modify the speech to sound the way it believes it is supposed to sound.

This flexibility makes HMM-based text-to-speech engines perfect for creating singing voices. It enables the engine to create high and low notes to transform normal words and sentences into songs.

With the engine’s ability to modify the voice in place, the last step is giving the text-to-speech engine knowledge of the music. Much like how a regular text-to-speech engine needs to know the text, a singing text-to-speech engine will also need to have knowledge of the musical structure.

There are several ways this can be achieved. One way is to input musical data into the singing text-to-speech engine with a musical score data sheet from a program like MuseScore. MuseScore is a simple tool that lets you build musical scores (like the one below) on your computer.

musescore sheet

Once the HMM-based text-to-speech engine receives the musical data and the text (or lyrics), it can generate the audio file!
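A toy version of handing the engine that musical knowledge might pair each lyric syllable with a pitch and a duration, then convert the pitch to a target fundamental frequency (F0), the quantity a flexible engine can bend the voice toward. The score entries and helper below are illustrative, not NeoSpeech’s actual data format:

```python
# Toy "musical knowledge" for a singing synthesizer: each score entry is
# (lyric syllable, MIDI pitch, duration in beats). The pitch becomes the
# fundamental frequency (F0) the voice should target.
def midi_to_hz(note):
    """Equal-temperament pitch: MIDI 69 = A4 = 440 Hz."""
    return 440.0 * 2 ** ((note - 69) / 12)

score = [("jin", 64, 0.5), ("gle", 64, 0.5), ("bells", 64, 1.0)]  # three E4 notes
f0_targets = [(syll, round(midi_to_hz(note), 1), beats) for syll, note, beats in score]
```

A real engine would also smooth transitions between notes, but the core idea is this mapping from score to pitch targets.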

Like we said though, that’s just one way out of many that singing text-to-speech can be achieved. We at NeoSpeech have been experimenting with the best ways to create a singing text-to-speech voice. While there’s still some time until we perfect the process, we can happily say that we’ve already had some great results!

NeoSpeech’s Singing Text-to-Speech Samples

Using the same voice in the above example, we gave our singing Japanese text-to-speech voice the musical data and lyrics to Jingle Bells (in Japanese, of course)! Here’s a sample of the synthesized singing:

Then, we threw in some background music to truly make it sound like a professional song recording:

Pretty cool, right? For now, we at NeoSpeech have only worked with our Japanese voices on singing text-to-speech. Don’t worry though, we’re not forgetting about our other languages!

listening to NeoSpeech synthesized singing

Singing text-to-speech is still in development, but NeoSpeech is working hard to push the boundaries of what text-to-speech is capable of and we’re hoping to make the ability to sing accessible to all in the near future!

What do you think?

Did you enjoy our singing text-to-speech examples? What are you most excited about using singing text-to-speech for? Let us know in the comments!

Learn More about NeoSpeech’s Text-to-Speech

Want to learn more about all the ways Text-to-Speech can be used? Visit our Text-to-Speech Areas of Application page. And check out our Text-to-Speech Products page to find the right package for any device or application.

If you’re interested in integrating Text-to-Speech technology into your product, please fill out our short Sales Inquiry form and we’ll get you all the information and tools you need.

Related Articles

Check out NeoSpeech’s Interactive Online Text-to-Speech Demo!

Learn how to enable NeoSpeech’s Text-to-Speech voices in Microsoft Word

What is Text-to-Speech and How Does It Work?

The post NeoSpeech Is Working On Singing Text-to-Speech Voices appeared first on Text2Speech Blog.

How to Integrate Text-to-Speech in the Internet of Things


Want to give your IoT product a voice? Still not convinced that text-to-speech is a must have feature for an IoT device? Keep on reading to find all the answers you need.

Text to speech in the internet of things

If you’re familiar with the Internet of Things (IoT), then you already know that IoT is all about being connected, sharing information, and communication. IoT separates great devices from ordinary ones because of the value it can provide to the user. (If you’re not familiar with IoT yet, read our article on it here!)

Text-to-Speech technology can be viewed the same way. Text-to-speech makes interactions between devices and humans more natural and engaging. You could say that text-to-speech is also about being connected, sharing information, and communication.

It’s no wonder that many of today’s IoT devices feature text-to-speech.  If current trends continue, almost all IoT devices will feature text-to-speech in the next few years.

So how does a manufacturer go about integrating text-to-speech into an IoT device? After all, these devices usually have limited computing power and storage space, and must work in real-time.

Thankfully, it can be a lot easier than you think with NeoSpeech. Whether you want to host a text-to-speech engine on your device, in our cloud, or in your own cloud, we have the solutions that can make it happen without a hassle.

So which option is right for you? And how can we make that happen? Keep on reading to find your answers.

Why does IoT need a voice?

In case you’re not convinced yet that an IoT device should have a text-to-speech voice, hear us out.

Our phones, alarm clocks, thermostats, appliances, and just about any other device that can connect to the internet is a part of the Internet of Things. These devices use their ability to connect to the internet to communicate information to other devices and people.

This makes IoT devices much smarter than regular devices. Your thermostat at home can record that the temperature in your house is 76 degrees and communicate it to your car as you’re driving home. After your car relays the message to you, you can tell your car to set the thermostat to 68 degrees. Your car would then communicate that to the thermostat which would set the new temperature.
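That exchange can be sketched as two “devices” passing messages through a shared channel standing in for the internet (the device names and message fields here are invented for illustration):

```python
# Toy sketch of the thermostat/car exchange above: two devices swap
# messages through a shared in-memory channel standing in for the internet.
class Channel:
    def __init__(self):
        self.inboxes = {}

    def send(self, to, message):
        self.inboxes.setdefault(to, []).append(message)

    def receive(self, device):
        return self.inboxes.pop(device, [])

net = Channel()
net.send("car", {"from": "thermostat", "temp_f": 76})   # thermostat reports the reading
reading = net.receive("car")[0]                         # car relays 76 F to the driver
net.send("thermostat", {"from": "car", "set_f": 68})    # driver's command goes back
command = net.receive("thermostat")[0]                  # thermostat applies the new set point
```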

These two IoT devices were able to communicate and share information with each other, which makes both devices more accessible, usable, and valuable.

IoT is all about communicating and sharing information. Therefore, shouldn’t your IoT device be able to communicate and share information with its user in the most natural, humanlike way possible?

Text-to-speech enables your devices to talk to your users. This makes the device appear smarter. It also makes it much easier to use. Your users would be able to perform other tasks, such as driving or cooking, while hearing information from whatever IoT device they’re using.

Without text-to-speech, your IoT device would require a screen that users would have to focus on to read whatever message it has. Frankly, this is starting to become outdated.

After all, if an IoT device is able to communicate efficiently with other devices, shouldn’t it be able to communicate with people as efficiently as possible too?

How to give IoT a voice

In order to have a voice, you need a text-to-speech engine. The engine is where the database of recordings from the voice actor is stored. It’s what is able to convert inputted text into audible speech instantly.

When integrating text-to-speech into your IoT product, the first thing you need to decide is where to store the text-to-speech engine.

The engine can be installed locally on your device, within an IoT hub, in your cloud, or in NeoSpeech’s own cloud.

Stored locally

If your device has a sufficient amount of computing power and storage space, and/or if your product won’t always be connected to the internet, then you want the engine to be installed locally on the device.

Some IoT devices are meant to go wherever the user goes. Many times the user will go where there is no internet connection. If that’s the case, then the IoT device must be able to work without the internet.

By storing the engine within the device, you can ensure that the text-to-speech function will always work. Whenever the device needs to convert a string of text into speech, it’ll send the request to the engine stored in the device, instantly convert it into a speech output, and then play the speech.

To make this happen, all you need is NeoSpeech’s VoiceText Embedded SDK package. This package will come optimized for your specific embedded device system requirements (and our embedded engines have small footprints so they don’t take up too much memory). You’ll get the text-to-speech engine, which you would store in your device, and then an SDK which will make it very easy for you to program your device to interact with the engine exactly how you want it.

Stored in an IoT hub

This solution is perfect for those making a suite of smart home products that all connect to one hub.

A smart home hub is a device that connects and controls all of your other smart home devices. Amazon Echo can be thought of as a hub, as you can use it to control your other smart home devices (such as telling the Echo to turn off your upstairs lights).

Smart home hubs usually have more processing power and storage space than other IoT devices, which makes it easier to store the engine in it.

When the smart home hub receives a text-to-speech request from an end-device, it can convert it into speech and then either play the speech through the hub’s speakers or send the audio file back to the end-device, which then plays the audio.
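A minimal sketch of that hub-side routing, with a stub standing in for the TTS engine stored on the hub (the function and device names are invented for illustration):

```python
# Hub-side routing sketch; synthesize() is a stub for the TTS engine
# stored on the hub, and the device names are invented.
def synthesize(text):
    return ("[audio] " + text).encode()  # placeholder for real synthesis

def handle_tts_request(text, reply_to, play_on_hub=False):
    audio = synthesize(text)
    if play_on_hub:
        return ("hub-speaker", audio)  # speak through the hub itself
    return (reply_to, audio)           # send the audio back to the end-device

target, audio = handle_tts_request("Front door unlocked.", reply_to="door-sensor")
```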

Either way, we have the text-to-speech packages that will make it easy for you to build.

And what makes this solution great is that the devices don’t need an internet connection for the text-to-speech conversions; they only need to be able to connect to the smart home hub through a local network connection.

In your cloud

If your IoT device functions by always having a connection to your company’s server(s), then we have good news for you! Our VoiceText TTS Server SDK package lets you integrate our text-to-speech engine into your server.

This process is just as simple as the ones mentioned above. All you need to do is install our engine and Server SDK in your server, and then use our Server SDK to program how your product will use the engine and when.

Then, whenever your IoT device needs to make speech, it will just send the request to your server, which will then instantly send back the audio file of the speech to play.

This enables you to free up limited memory space and processing power on your IoT device, while still allowing it to perform text-to-speech conversions in real time.

In NeoSpeech’s cloud

Did the option above sound perfect for you, except for the part where you don’t have a server? Don’t worry, you can just connect all of your IoT devices to NeoSpeech’s cloud through our Web Service platform.

For this, we’ll just send you the API your programmer needs to allow your IoT devices to connect to our text-to-speech server.

Once it’s set up, whenever your IoT device needs to say something, it’ll send the request to our cloud which will send back the speech.
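The round trip might look like the sketch below, where `fake_cloud_tts` stands in for the real Web Service call; the payload fields and the stub itself are invented for illustration, not the actual API:

```python
import json

# Shape of the cloud round trip; fake_cloud_tts and the payload fields
# are invented stand-ins, not NeoSpeech's actual Web Service API.
def fake_cloud_tts(request_body):
    request = json.loads(request_body)
    return ("[audio] " + request["text"]).encode()  # a real service streams audio back

payload = json.dumps({"voice": "James", "text": "Oven preheated to 350 degrees."})
audio = fake_cloud_tts(payload)  # a real device would POST this over HTTPS
```

The device-side logic stays tiny: build a request, send it, play what comes back.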

Using NeoSpeech for IoT

Text-to-speech and IoT can be a match made in heaven. Giving your IoT device a voice will make it easier to use and increase customer satisfaction. Plus, with NeoSpeech’s suite of text-to-speech solutions, it can be very easy to integrate text-to-speech, just give us a call and we’ll help make it happen!

What do you think?

Do you think all IoT devices should feature text-to-speech? How do you see speech technology and IoT growing in the next few years? Let us know in the comments!

Learn More about NeoSpeech’s Text-to-Speech

Want to learn more about all the ways Text-to-Speech can be used? Visit our Text-to-Speech Areas of Application page. And check out our Text-to-Speech Products page to find the right package for any device or application.

If you’re interested in integrating Text-to-Speech technology into your product, please fill out our short Sales Inquiry form and we’ll get you all the information and tools you need.

Related Articles

Check out NeoSpeech’s Interactive Online Text-to-Speech Demo!

Intelligent Virtual Assistants Continue to Choose NeoSpeech’s Text-to-Speech Voices

What is Text-to-Speech and How Does It Work?

 

Don’t forget to follow us on LinkedIn, Facebook, Google+, and Twitter!

The post How to Integrate Text-to-Speech in the Internet of Things appeared first on Text2Speech Blog.

NeoSpeech SAPI Text-to-Speech Overview


Are you in need of high quality text-to-speech voices that are compatible with any SAPI compliant application? Read on to find out how NeoSpeech makes it easy.

NeoSpeech SAPI text to speech

This blog is part 4 of a 7-part series highlighting each of NeoSpeech’s text-to-speech solutions.

SAPI, or Speech Application Programming Interface, is an API developed by Microsoft that allows the use of speech technology in Windows applications. SAPI can be thought of as a set of rules and protocols that developers follow to integrate speech functions, such as text-to-speech, into their SAPI compliant application.

SAPI is well known in the speech technology world because it is accessible to all, easy to follow, and used by hundreds of the most popular Windows applications today.

This is why we at NeoSpeech released our VoiceText SAPI product a long time ago. Our SAPI text-to-speech voices sound as natural and realistic as NeoSpeech’s voices have always been known to sound. However, these voices are special because they were developed with SAPI’s specifications and protocols in mind.

So what does that mean? It means that by using our VoiceText SAPI, you can integrate any of NeoSpeech’s voices into any Windows application that uses SAPI! It’s simple, easy, and produces great results!

Let’s take a deeper look at how you can give your Windows application a lifelike voice with NeoSpeech’s VoiceText SAPI.

What is SAPI?

If you’re still scratching your head and not too sure what SAPI is, don’t worry, this short explanation should clear that up.

An API (Application Programming Interface) is a set of rules, protocols, and tools that specify how software components should interact with each other. In short, an API enables applications to communicate with each other. This video does a fantastic job of explaining exactly what APIs do.

Let’s say you’re on your smart phone and you look up the address of a movie theater in your browser app. You want to find out how to drive there, so you click the address and it automatically opens up your Maps app with the address already put in it. In this case, your browser app and the Maps app were able to communicate with each other thanks to an API that defined how the two apps would talk to each other.

SAPI is a speech-specific API (hence adding the “S” in front of “API”) that enables the use of speech recognition and speech synthesis within Windows applications.

So if you download a SAPI-compliant text-to-speech voice onto your computer, you can follow the protocols set by SAPI and write the code that tells your Windows applications how to use the voice. Once you’re done, your Windows application will use the voice exactly as you programmed it to.
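The SAPI pattern boils down to: enumerate installed voices, pick one, speak. The real interface is a COM object (`SAPI.SpVoice`), so the toy classes below only mirror its shape for illustration and are not the actual API:

```python
# Toy stand-ins for the SAPI pattern (enumerate voices, pick one, speak).
# The real interface is a COM object (SAPI.SpVoice); these classes only
# mimic its shape for illustration.
class FakeVoice:
    def __init__(self, description):
        self.description = description

class FakeSpVoice:
    def __init__(self, voices):
        self.voices = voices
        self.voice = voices[0]  # the system's default voice

    def get_voices(self):
        return self.voices

    def speak(self, text):
        return "{} speaks: {}".format(self.voice.description, text)

tts = FakeSpVoice([FakeVoice("NeoSpeech Julie"), FakeVoice("NeoSpeech James")])
tts.voice = tts.get_voices()[1]     # select a different installed voice
line = tts.speak("Hello from SAPI.")
```

Any SAPI-compliant voice installed on the machine shows up in the enumeration, which is why applications can use new voices without code changes.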

Sounds easy enough, right? Even better is that many SAPI applications already have text-to-speech functionality built into them, so all you need to do is download and install the SAPI voices and you’re ready to go! No coding required!

Here is an example of a few Windows applications that are SAPI compatible:

  • Microsoft Office apps (Excel, Word, etc.)
  • Microsoft Narrator
  • Microsoft Voice Command
  • Adobe Reader
  • Adobe Captivate
  • JAWS
  • Window-Eyes
  • CoolSpeech
  • NonVisual Desktop Access

Any application that is SAPI compliant can use any of NeoSpeech’s voices thanks to our VoiceText SAPI.

What is NeoSpeech’s VoiceText SAPI?

NeoSpeech’s VoiceText SAPI isn’t much different from our Engine, Server, or Embedded SDK packages. With all of those packages, you download our text-to-speech engine(s) and an SDK that gives you direction on how to build an application using our voices.

Our VoiceText SAPI product is essentially a separate version of our Engine, Server, or Embedded package which was developed to be compatible with SAPI. So, with our VoiceText SAPI, you can integrate NeoSpeech’s voices into any engine, server, or embedded SAPI application!

Who is this solution for?

The Engine, Server, and Embedded applications mentioned earlier are for people who want to build custom applications. VoiceText SAPI is for people who want to simply integrate our voices into already existing Windows applications (as long as they’re SAPI compliant), or for people who want to build their own custom Windows application that is SAPI compliant.

Do you want to use Microsoft Word to create voice prompts? Do you have a Windows Server application for your call center and you want to integrate our text-to-speech voices into it? Are you using an authoring tool or eLearning application that could use a better sounding voice? If so, then VoiceText SAPI is what you need!

How does NeoSpeech optimize it for my needs?

We’ll make sure to send you the right VoiceText SAPI package for your needs. We have SAPI versions of our Engine, Server, and Embedded SDK packages. Whether you need the Engine, Server, or Embedded version depends on the type of device and application you’re working with.

All of our voices have 32-bit and 64-bit SAPI versions. The kind you need depends on your operating system.

To make all of this simple, just tell our sales team what kind of computer you’re using, what voices you want, and what application you want to integrate them into. We’ll make sure the package we send you meets all of your specifications.

How does it work?

Integrating our SAPI text-to-speech voices into your Windows application is a very simple process with VoiceText SAPI. Let’s go over the steps:

Purchasing process

Do you need SAPI voices or are you considering it? Get in touch with our Sales Team by filling out our online form! We will then quickly get in contact with you, get all the information we need, and answer any questions you may have. Then, our Sales Team will send you your VoiceText SAPI package!

Installation

We’ll send the complete package to your email address. Just click the download link and follow the instructions to get our SAPI voices downloaded and installed on your computer.

Verification

In that same email, we’ll send you instructions on how to verify that your copy of our software is not pirated. This only takes a few minutes but is necessary in order to have full access to our text-to-speech engine.

Integration

At this point, the engine is ready to be used!

One thing you can do to start is set any of your new voices as the default voice for your Windows computer. You can do this by going into the Speech Properties section of the Control Panel and setting your preferred voice to whichever NeoSpeech voice is your favorite.

You can also start accessing the voice(s) from all of your other SAPI applications on your computer! Most of your applications should already have speech settings that you can use within the application. An example of this would be the built-in Speak Command in Microsoft Word that reads out the text on a Word document.

Text to Speech SAPI

That’s the overview of our VoiceText SAPI package! If you want to use our high quality voices in your Windows applications, then this is the solution you need! Click the button below to get started.

I want VoiceText SAPI!

Don’t forget to check out the other blogs in this series as we highlight all of NeoSpeech’s text-to-speech solutions:

Learn more about NeoSpeech’s text-to-speech

Want to learn more about all the ways Text-to-Speech can be used? Visit our Text-to-Speech Areas of Application page. And check out our Text-to-Speech Products page to find the right package for any device or application.

If you’re interested in integrating Text-to-Speech technology into your product, please fill out our short Sales Inquiry form and we’ll get you all the information and tools you need.

Related Articles

New eBook, “Which Text-to-Speech Product Is Right For You?” released

Check out NeoSpeech’s Interactive Online Text-to-Speech Demo!

What is Text-to-Speech and How Does It Work?

Don’t forget to follow us on LinkedIn, Facebook, Google+, and Twitter!

The post NeoSpeech SAPI Text-to-Speech Overview appeared first on Text2Speech Blog.

New VT Editor Tutorial Video


NeoSpeech’s popular text-to-speech application, VT Editor, has a new tutorial video! In this video, we cover all of the useful and powerful features the application comes equipped with, and we teach you how you can take advantage of them.

VT Editor lets you take an unlimited number of words and convert them into speech. Then, you can edit any and all parts of the speech thanks to our easy-to-use VTML tags. Finally, you can either save the text prompts for later or download and save the speech as an audio file.

Without further ado, here’s our VT Editor Tutorial:

Like what you see? You can purchase VT Editor from NeoSpeech any time!

First, you’ll want to decide which text-to-speech voice you want. Thankfully, we have an interactive online demo that lets you test any of our 40+ voices in 15 languages.

Once you’ve picked your voice, just fill out this form to get in contact with our friendly Sales Team, who will send you the package that you need!

What do you think?

Was our tutorial video helpful? Let us know in the comments!

Learn More about NeoSpeech’s Text-to-Speech

Want to learn more about all the ways Text-to-Speech can be used? Visit our Text-to-Speech Areas of Application page. And check out our Text-to-Speech Products page to find the right package for any device or application.

If you’re interested in integrating Text-to-Speech technology into your product, please fill out our short Sales Inquiry form and we’ll get you all the information and tools you need.

Also, don’t forget to follow us on Twitter, Facebook, and LinkedIn!

Related Articles

NeoSpeech’s Text-to-Speech Is More Intelligent Than You Think

Text-to-Speech Features That Make You More Productive

What is Text-to-Speech and How Does It Work?

The post New VT Editor Tutorial Video appeared first on Text2Speech Blog.

NeoSpeech VT Editor Overview

Create high quality voice files from your computer with NeoSpeech’s VT Editor text-to-speech application.

VT Editor text to speech application

This blog is part 5 in a 5-part series highlighting each of NeoSpeech’s text to speech solutions

With NeoSpeech’s VT Editor, you can create voice prompts right from your computer. You can also save the voice files for later use. This is a great solution for eLearning applications, announcement systems, and much more.

While most of NeoSpeech’s solutions provide you with the necessary tools to build custom speech-enabled applications and products, VT Editor is unique because it already is a fully functioning, ready to go text-to-speech application. All you need to do is install it on your computer and you’ll be creating text-to-speech files in no time.

So who is this solution for? And how does it work? In this blog, we’ll tell you everything you need to know about VT Editor.

What is it?

As stated earlier, VT Editor is a text-to-speech application for your computer. It’s an application similar to Microsoft Word, in the sense that when you open it you have a blank screen where you can type or insert any amount of text.

But unlike Word, VT Editor is equipped with the NeoSpeech voice of your choice! Not only will it instantly convert the text into audible speech, but it will also give you the ability to easily modify the speech by editing variables like pitch, speed, pauses, and pronunciation. Finally, you can export and save audio files of the speech for later use.

If you want to see VT Editor in action, check out our new tutorial video below:

Who is this solution for?

VT Editor is a great solution for both personal and commercial users.

Personal users can use VT Editor for a variety of reasons. Many use it to copy and paste text from documents and online articles into the editor and then convert it into speech so they can listen to the content right away or save it for later. Some even use it to create voice prompts so they can communicate easier. There’s no telling how many useful things you can do with VT Editor.

For commercial use, having an application that’s capable of making high-quality, professional sounding voice recordings available at any moment 24 hours a day is invaluable. You can use VT Editor to create and save voice prompts for eLearning courses, IVR servers, call centers, announcement systems, and many other applications.

Another great thing about VT Editor is that it doesn’t require an internet connection to use. Most text-to-speech applications are cloud-based, meaning you have to be online to access them. VT Editor gets stored directly on your computer allowing you to use it whether or not you have an internet connection.

How does NeoSpeech optimize it for my needs?

First and foremost, we should say that VT Editor is a Windows-only application. So make sure your computer runs on a Windows operating system!

While most NeoSpeech solutions can be customized to fit your needs, VT Editor is an application that will be the same for everyone. You can rest assured, though, knowing that each VT Editor package we send comes fully equipped with all of the powerful editing and exporting tools that make VT Editor such a great text-to-speech application.

The biggest, and pretty much only, choice you need to make is to decide which of our 40+ voices in 15 languages you want!

How does it work?

So, you’ve decided that you want VT Editor for personal use or for your business. Just follow the steps below and you’ll have VT Editor installed and making voice prompts in no time!

Purchasing process

First, you’ll need to get in touch with our Sales Team. Thankfully, we made it easy with this short form. Once you fill out the Sales Inquiry form, our team will promptly reach out to you to get all the information we need and to send you your copy of VT Editor.

Installation

Installing VT Editor is a simple process. We’ll email you a link that starts downloading the package once clicked. Just follow the on-screen instructions and choose where you want to save VT Editor on your computer.

Verification

In the same email, we’ll also send you the license key that you can use to verify your copy of VT Editor, along with a set of instructions on how to do that.

Once you’ve verified your copy of VT Editor, you’ll have full access to it, so make sure you don’t skip this step!

That’s it!

From here, you can start making voice files with ease! VT Editor is a popular text-to-speech solution because it is a ready to go application. If you’re interested in purchasing VT Editor, get started now by getting in touch with our friendly Sales Team.

I want VT Editor!

Don’t forget to check out the other blogs in this series as we highlight all of NeoSpeech’s text-to-speech solutions:

Learn more about NeoSpeech’s text-to-speech

Want to learn more about all the ways Text-to-Speech can be used? Visit our Text-to-Speech Areas of Application page. And check out our Text-to-Speech Products page to find the right package for any device or application.

If you’re interested in integrating Text-to-Speech technology into your product, please fill out our short Sales Inquiry form and we’ll get you all the information and tools you need.

Related Articles

New eBook, “Which Text-to-Speech Product Is Right For You?” released

Check out NeoSpeech’s Interactive Online Text-to-Speech Demo!

What is Text-to-Speech and How Does It Work?

Don’t forget to follow us on LinkedIn, Facebook, Google+, and Twitter!

The post NeoSpeech VT Editor Overview appeared first on Text2Speech Blog.

How to use NeoSpeech’s text-to-speech voices in Adobe Captivate

Adobe captivate text to speech

Did you know that Adobe has teamed up with NeoSpeech to enable users to add narration to their Captivate projects with NeoSpeech’s natural sounding text-to-speech voices? Take advantage of this feature and add professional narrations to your courses.

Adobe Captivate has long ago established itself as one of the premier authoring tools available today, and it’s not hard to see why. With a wide range of easy to use features, such as responsive sliders and video-audio support, Captivate makes creating eLearning content easy. Educators have been using Captivate for over 15 years to create digital courses, demonstrations, quizzes, and much more.

One of Captivate’s most used and popular features is the ability to add narration. Adding narration to eLearning content is essential to helping students focus, be engaged, and retain information.

Users can add narration by recording their own voices or by using text-to-speech. Don’t like your own voice? No microphone, audio-recording program, or time? No problem. Text-to-speech has you covered! Save time by using text-to-speech to narrate the lecture notes you’ve already put together.

By simply writing out the lines of speech and placing them in your course where you want them to go, you can add professional sounding narration thanks to NeoSpeech’s realistic text-to-speech voices. And the best part? You can do this for free!

Here are a few quick steps to get started using NeoSpeech in Adobe Captivate!

How to use Captivate’s text-to-speech feature

To use Captivate’s text-to-speech feature, go to the slide you want to add text-to-speech to and open the Slide Notes Panel. The Slide Notes Panel is accessible through the Window tab.

text to speech captivate 1

Next, click the Text-to-Speech or Closed Captioning button (The title of the button may be different depending on which version of Captivate you have).

text to speech captivate 2

Now, the Speech Management tool will pop up. Within this tool, you can write your narration. You can select the text-to-speech voice of your choice through the Speech Agent drop-down menu.

text to speech captivate 3

After you’ve selected your voice and written the narration, you can save the narration and listen to a preview (the preview feature is available in newer versions of Captivate). That’s all you need to know to start using Captivate’s text-to-speech feature!

How to use NeoSpeech’s free voices in Captivate

If you don’t see NeoSpeech’s voices within the Speech Agent menu, you’ll need to download NeoSpeech’s voice package. Adobe lets you download NeoSpeech’s text-to-speech package for free from their website! You can find all the links for the download files for each version of Captivate here: https://helpx.adobe.com/captivate/kb/captivate-text-speech-converters.html.

Once you’ve downloaded and installed the package, you will be able to use NeoSpeech’s voices in Captivate! Here are the NeoSpeech voices included in Adobe’s free package:

  • James – Male – US English
  • Paul – Male – US English
  • Julie – Female – US English
  • Kate – Female – US English
  • Bridget – Female – UK English
  • Chloe – Female – Canadian French
  • Yumi – Female – Korean


Again, once the NeoSpeech package is installed you’ll be able to use any of those voices in your Captivate projects!

How to use additional NeoSpeech voices in Captivate

What if you want a different NeoSpeech voice? Or maybe you need a voice in a different language? After all, NeoSpeech currently has over 40 voices in 15 languages.

(We keep adding new languages and voices. We’ll have 35 languages available in 2 years!)

If you’re interested in using an additional voice in Captivate, you’ll have to purchase the voice directly through NeoSpeech. Get in touch with our friendly Sales Team and they will tell you everything you need to know about pricing and how to install our voices for use in Captivate.

NeoSpeech in Adobe Captivate

That’s everything you need to know about using NeoSpeech’s text-to-speech voices in Adobe Captivate! We hope this guide was helpful and that you can start creating professional sounding narrations for your courses with ease.

What do you think?

Do you use Captivate? Have you tried, or are you interested in trying, NeoSpeech’s voices? Let us know in the comments!

Learn More about NeoSpeech’s Text-to-Speech

Want to learn more about all the ways Text-to-Speech can be used? Visit our Text-to-Speech Areas of Application page. And check out our Text-to-Speech Products page to find the right package for any device or application.

If you’re interested in integrating Text-to-Speech technology into your product, please fill out our short Sales Inquiry form and we’ll get you all the information and tools you need.

Related Articles

Check out NeoSpeech’s Interactive Online Text-to-Speech Demo!

How NeoSpeech is giving people their voices back

What is Text-to-Speech and How Does It Work?

 

Follow us on LinkedIn, Facebook, Google+, and Twitter!

The post How to use NeoSpeech’s text-to-speech voices in Adobe Captivate appeared first on Text2Speech Blog.

Text-to-Speech for Personal Use

Interested in using text-to-speech just for yourself? Find out how NeoSpeech can help.

Text to speech for personal use

Text-to-speech can be a fantastic tool for personal users! Not only does it make the world more accessible to those who need it, but it can also be a very productive tool to students, workers, and just about anyone else.

NeoSpeech is known for producing the best, most natural sounding text-to-speech voices on the market today, so it’s no wonder why we receive so many requests from personal users to purchase our voices.

While we focus primarily on providing text-to-speech solutions for commercial use with packages that help developers build custom speech-enabled devices or applications, we also offer a few solutions for those just wanting to use our text-to-speech voices for themselves!

If you’re interested in text-to-speech for personal use, read on to find out everything you need to know about purchasing and using our voices.

Applications of NeoSpeech for personal users

For personal use, text-to-speech usage basically breaks down into two categories: either it communicates information to you, or you use it to communicate information to others.

Most personal users of NeoSpeech are using our text-to-speech voices to have information communicated to them. They do this by taking text from online documents, articles, or web pages and entering it into our text-to-speech engine that is able to convert all of the text into speech instantly.

This is especially useful for people who have trouble reading text on a screen. People with visual or learning disabilities can often find it hard to read text, making text-to-speech an extremely helpful tool.

Some personal users enjoy text-to-speech because it increases their productivity. For example, someone may want to have their information read aloud to them while they do other tasks. Or they can save the audio files and listen to them later like when they’re driving.

We also have personal users who use our engines to communicate for them. Just recently we published the story of Debbie, a woman who uses our VT Editor program to communicate on the phone. While VT Editor wasn’t built as an Augmentative and Alternative Communication (AAC) device, Debbie uses it to type in what she wants to say and puts her phone to her computer speakers while our text-to-speech voice speaks.

There could be many reasons why you want to use text-to-speech for personal use, but the important thing to remember is that you are using it for yourself. If you are using the voices for your company, or you’re distributing the audio content produced by our text-to-speech voices (such as in a YouTube video), then you’ll need to purchase an audio distribution license.

NeoSpeech products for personal users

As mentioned earlier, most of NeoSpeech’s products are toolkits that developers can use to build their custom products and applications with our text-to-speech voices. Personal users often don’t have the time or resources to build a text-to-speech application, meaning they need one that’s already built and ready to go.

NeoSpeech’s VT Editor is exactly that. VT Editor is a Windows application that you can install on your computer and use at your pleasure. Whenever you want to convert text into speech, just type or paste the text into the application and it will convert it into speech using the NeoSpeech voice of your choice.

Similarly, NeoSpeech’s SAPI voices can also be used by personal users. SAPI is an API developed by Microsoft that enables the use of speech technology. Basically, if you download our SAPI voices onto your computer, you can use them in any other application on your computer, as long as it is SAPI-compliant. SAPI-compatible Windows applications include all Microsoft Office apps, Adobe Reader, and Adobe Captivate.

Purchasing process for personal users

Are you interested in using our text-to-speech voices for personal use? To get started, make sure you check out our interactive online demo so you can test all of our voices and languages to make sure you find the right voice for you.

When you’re ready, fill out our short Sales Inquiry form to get in contact with our friendly Sales Team. In the “Company” section of the form, just write “Self” since you plan on using our voices just for yourself.

Once you submit your inquiry, our Sales Team will get in touch with you to get all the information they need to find the right package for you. Here are some of the important details we’ll need:

  • How you plan on using the text-to-speech.
  • What type of computer you have and what operating system it’s running on.
  • Which voice(s) you want.

Once we have all of the information we need, we’ll send you pricing and then send you your download package once we receive payment.

Important things you should know

Before you decide to purchase any of NeoSpeech’s solutions for personal use, keep the following in mind:

  • VT Editor and our NeoSpeech SAPI voices for personal use are only compatible with Windows computers. We do not offer products for Mac users.
  • When you purchase our solutions for personal use, you purchase one license that will let you use our technology. That license will only work for 1 device! You will not be able to use our VT Editor or SAPI voices on other devices. If you ever change devices, you’d have to purchase a new license. However, keep in mind that we generally do not sell multiple personal use licenses to the same person.
  • Distributing NeoSpeech voices is considered commercial use and requires a commercial license. Creating audio files to put in a video (such as a YouTube video), putting audio files on a website, hosting a presentation, and creating an online audio book are all considered commercial usage, not personal use.

Learn More about NeoSpeech’s Text-to-Speech

Want to learn more about all the ways Text-to-Speech can be used? Visit our Text-to-Speech Areas of Application page. And check out our Text-to-Speech Products page to find the right package for any device or application.

If you’re interested in integrating Text-to-Speech technology into your product, please fill out our short Sales Inquiry form and we’ll get you all the information and tools you need.

Related Articles

Check out NeoSpeech’s Interactive Online Text-to-Speech Demo!

How NeoSpeech is giving people their voices back

What is Text-to-Speech and How Does It Work?

Follow us on LinkedIn, Facebook, Google+, and Twitter!

The post Text-to-Speech for Personal Use appeared first on Text2Speech Blog.


What is a VTML tag?

VTML stands for Voice Text Markup Language and it’s a language specific to NeoSpeech’s text-to-speech software. VTML lets you modify how our text-to-speech voices read your text. You can edit the prosody (speech rhythms like speed and pitch) of our text-to-speech voices to make them sound more natural.

When we talk, we’re conveying more than just the literal meaning of the words we’re saying. We’re communicating another level of meaning with the way we speak. VTML lets you customize how our text-to-speech voices speak so you can communicate more effectively.

If you’ve worked with another markup language before, such as HTML, VTML will look similar. Both VTML and HTML are languages through which people can tell a software how to format, or in the case of text-to-speech, how to read a text prompt. If you’ve never used a markup language, we’ll break down what makes up a VTML tag.

Components of a VTML tag

Every VTML tag starts and ends with angle brackets (< >). These angle brackets let the software know that the text inside is a command for it to complete. The first angle bracket is always < and the last angle bracket is always >. Think of the angle brackets as hands that are holding the text inside together.

First, we start with an angle bracket.

     < 

Next, we let the software know which language we are using. In this case, VTML.

     <vtml

Note that when talking about VTML we capitalize it because we are using it as an acronym. When writing a tag, all text should be lowercase.

An underscore tells the software a command is coming next.

     <vtml_

For a command, we’ll do speed.

     <vtml_speed

Next, we have a space and specify a property of the command. In this case for speed, it will be value.

     <vtml_speed value

Then we choose a numeric value. For the speed property, you can choose a value between 50 and 400, with 100 being normal speed and higher values being faster.

     <vtml_speed value="150">
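As a sketch, the pieces assembled above can also be put together programmatically. This is a hypothetical Python helper for illustration only, not part of any NeoSpeech product; Python is not required to write VTML by hand.

```python
def vtml_open_tag(command, prop, value):
    """Assemble a VTML opening tag from its parts: angle bracket,
    language name, underscore, command, property, and value."""
    return f'<vtml_{command} {prop}="{value}">'

print(vtml_open_tag("speed", "value", 150))  # <vtml_speed value="150">
```

The helper simply mirrors the left-to-right assembly shown in the steps above.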

Now that you’ve written an opening tag, let’s look at when a command needs one or two tags.

Opening and closing tags

In the example above, we told our text-to-speech software to read at a faster speed, but how will our text-to-speech software know which words you want to be spoken quickly? That’s why the speed command has two tags, an opening tag and a closing tag.

We just went through what makes up an opening tag. The closing tag is the easy part, as it lets our text-to-speech software know that you want it to stop applying the command. The closing tag is a forward slash followed by the name of the markup language being used and the command.

For speed, the closing tag looks like this:

     </vtml_speed>

Let’s see the speed tag in action. Say we have a sentence where we want the word “slow” to be read slowly. We place the opening tag before the word “slow” and the closing tag after the word “slow.”

     The mouse was <vtml_speed value="50">slow</vtml_speed> compared to the snake.

The speed command has an opening tag and a closing tag so you can select the areas where you want the software to implement the command. If you’re using a command with an opening and closing tag, such as speed, pitch, and volume, don’t forget to include a closing tag or our text-to-speech software won’t execute the command.

How do you know when a command needs an opening and closing tag and when it doesn’t? Ask yourself: do you need to tell our text-to-speech software which words to apply the command to, or does the command stand alone? If you need to tell our text-to-speech software which words to execute the command on, like we did with speed, then it is a command that needs an opening and closing tag. If you’re not sure, you can always check our VTML Manual.
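Wrapping selected text in matching opening and closing tags can be sketched in a few lines of Python. This is a hypothetical helper for illustration, assuming only that the paired commands (speed, pitch, volume) all take a `value` property as shown above.

```python
def vtml_wrap(text, command, value):
    """Enclose text in a matching pair of VTML tags, e.g.
    <vtml_speed value="50">slow</vtml_speed>."""
    return f'<vtml_{command} value="{value}">{text}</vtml_{command}>'

sentence = ("The mouse was " + vtml_wrap("slow", "speed", 50)
            + " compared to the snake.")
print(sentence)
# The mouse was <vtml_speed value="50">slow</vtml_speed> compared to the snake.
```

Because the closing tag is generated together with the opening tag, a helper like this makes it impossible to forget the closing tag that the command needs.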

Standalone tag

Let’s look at a command that is a single tag. For example, the break command tells our text-to-speech software to take a breath, so to speak, and is one tag. The break doesn’t change how a word is read, but determines how long the software waits before continuing to read.

Here’s the break tag.

     <vtml_break level="3"/>

Notice how the forward slash is included at the end of the tag? The forward slash tells our text-to-speech software that the command is complete. You can’t use the break command as an opening and closing tag since it doesn’t make sense for a break to span across words.

VTML tags in VT Editor

If you’re using our VT Editor, this application makes it really easy to use VTML tags. Simply select options in the menu and VT Editor inserts the tags for you.

For instance, go to Edit. Choose Break. Select a value. VT Editor will insert the tag for you in blue.

VTMLtagScreenshot1

For a command that has two tags, like the speed command, you will first have to select the text you want modified. VT Editor will input the opening tag at the beginning of the selection and the closing tag at the end of the selection.

VTMLtagScreenshot2

How easy is that? Now you can modify the prosody of our text-to-speech voices to your liking!

Learn More about NeoSpeech’s Text-to-Speech

Want to learn more about all the ways Text-to-Speech can be used? Visit our Text-to-Speech Areas of Application page. And check out our Text-to-Speech Products page to find the right package for any device or application.

If you’re interested in integrating Text-to-Speech technology into your product, please fill out our short Sales Inquiry form and we’ll get you all the information and tools you need.

Related Articles

Check out NeoSpeech’s Interactive Online Text-to-Speech Demo!

NeoSpeech VT Editor Overview

What is Text-to-Speech and How Does It Work?

Follow us on LinkedIn, Facebook, Google+, and Twitter!

The post What is a VTML tag? appeared first on Text2Speech Blog.

Text-to-Speech for Commercial Use

The advancement of text-to-speech technology has made a big impact in the business world in the past decade. It’s now much easier for companies and products to interact with customers. Thanks to text-to-speech, companies don’t need to staff large call centers to answer customer calls, nor do they have to hire voice actors to give a voice to their products, applications, or media.

Instead, they can just use text-to-speech.

TTScommercialUse

Once known for sounding robotic and unrealistic, text-to-speech today can rival actual human speech. Plus, technical advancements have made it much more feasible to integrate text-to-speech into just about any product.

NeoSpeech makes it even easier with our text-to-speech packages that can be customized to your specific needs. We take the hard work out of developing the perfect text-to-speech engine that fits your specifications, so you can easily integrate our high quality voices into your solution.

And when we say high quality voices, we mean it. NeoSpeech is known for producing the most realistic, natural sounding text-to-speech voices in the world.

For these reasons, NeoSpeech has been chosen by thousands of businesses looking to add a voice to their products and applications.

In this article, we’ll go over everything you need to know about NeoSpeech for Commercial Use, including the applications of our voices, our packages, and how you can get your hands on them.

Applications of NeoSpeech for commercial users
There are dozens of applications that NeoSpeech’s text-to-speech voices are used for. The versatility of our packages enables customers to use our voices just about any way they want.

Here are a few of the more popular commercial uses of text-to-speech technology:

Accessibility
Augmentative & Alternative Communication (AAC) devices and screen readers frequently feature text-to-speech capabilities to give the user the ability to speak or the ability to take in information.

Announcements
Announcement systems, weather alerts, and emergency alert systems are turning to text-to-speech. Text-to-speech is able to instantly convert any message into speech and get the message out with a clear voice. This is much more efficient than waiting for a human to record a message, especially in emergency situations when time is everything.

Audio Publishing
eBooks, online content, and even audio-enabled points of interest such as displays at museums are using text-to-speech to engage readers better than text ever could.

Education
It’s not surprising that eLearning is one of the fastest growing verticals in the text-to-speech industry. The ability to add narration to learning content has been proven to help students learn, while also making education more accessible to those with visual disabilities or literacy difficulties.

Health Care
Electronic Health Records (EHR) systems and speech-enabled medical devices use text-to-speech to improve the efficiency of the health industry.

Telecommunications
This is one of the biggest uses of text-to-speech today. With high quality voices and a smart, interactive system, companies can replace their entire call center staffs with an automated system, thanks in part to text-to-speech.

Transportation
Navigation apps on smart phones and GPS devices are among the most well-known uses of text-to-speech technology. Automated announcements at airports, bus stops, and other travel centers are becoming popular too.

There are many more ways text-to-speech can be used. Whatever product you’re looking to build, you can have confidence that NeoSpeech can provide the voice you need.

NeoSpeech products for commercial users
NeoSpeech’s product packages are designed to fit your specific needs. Our packages come with the complete text-to-speech engines of the voice(s) of your choice. They all also come with SDKs that make the process of integrating the text-to-speech engine into your product or application simple.

Here’s a look at NeoSpeech’s solutions for commercial customers:

VoiceText TTS Engine
The perfect package for anyone looking to build a standalone speech-enabled application.

VoiceText TTS Server
Similar to the Engine package, except this package is for people who want to host their text-to-speech product on a server, which would allow you to serve multiple text-to-speech requests at once.

VoiceText Embedded
This package was developed with embedded operating systems in mind. Embedded devices and smart phone applications generally have minimal storage space, minimal computing power, and very precise specifications. Our embedded package addresses all of these issues to make implementing text-to-speech into an embedded device or app easy.

VoiceText SAPI
SAPI is Microsoft’s Speech-API. With this package, you’ll be able to use our NeoSpeech voices in any other application that is SAPI-compliant.

VT Editor
Unlike the rest of our products, this is not an engine with a toolkit that you can integrate into your product. Instead, VT Editor is a ready to use text-to-speech application! Just type in any amount of text you wish and instantly convert it into speech! VT Editor also lets you save and download your speech files.

It’s also important to note that NeoSpeech does not send out cookie-cutter versions of our packages. Instead, we’ll optimize the package we send you to fit your needs. Everything from your operating system, technical specs, usage, and customers will influence how we can customize the package to better serve you.

Purchasing process for commercial users
Does it sound like NeoSpeech has the solution you need for your commercial product? If so, the first step is to submit a Sales Inquiry form to our team.

Submit a Sales Inquiry

Then, our friendly sales team will get in touch with you to learn more about your product so we can get you the right package and optimize it to fit your needs.

Learn More about NeoSpeech’s Text-to-Speech
Want to learn more about all the ways Text-to-Speech can be used? Visit our Text-to-Speech Areas of Application page. And check out our Text-to-Speech Products page to find the right package for any device or application.

If you’re interested in integrating Text-to-Speech technology into your product, please fill out our short Sales Inquiry form and we’ll get you all the information and tools you need.

Related Articles
Check out NeoSpeech’s Interactive Online Text-to-Speech Demo!

New eBook, “Which Text-to-Speech Product Is Right For You?” Released

What is Text-to-Speech and How Does It Work?

Follow us on LinkedIn, Facebook, Google+, and Twitter!

The post Text-to-Speech for Commercial Use appeared first on Text2Speech Blog.

Top 5 VTML Tags Infographic

You can change the speed, volume, pitch and more of our text-to-speech voices with VTML. If you’re wondering what VTML is, check out “What is a VTML tag?” Right now, we’ll show you how to use our most popular VTML tags: break, pause, pitch, speed, and volume. Feel free to bookmark this page for quick reference on how to use these popular VTML tags.

Top 5 VTML tags infographic

Break

The VTML break tag tells the text-to-speech voice to take a breath, so to speak. The break tag is useful if you want to quickly insert a moment of silence within a sentence or want to have a longer break at the end of a paragraph.

                <vtml_break level="3"/>

The value must be between 0 and 3, where 0 is read continuously and 3 is a sentence break.

Pause

The VTML pause tag is like the break tag in that it is also a moment of silence. The advantage of the pause tag is that you can be precise about the length of the pause.

                <vtml_pause time="250"/>

The value is in milliseconds and must be between 0 and 65535.

The break and pause tags don’t have a closing tag because they are used in spaces not on the words themselves. You can have a pause or break after a word or sentence, but not within a word. If you put a break or pause tag in the middle of a word, it will be read as two separate words. If you want a word to be said slowly, you’ll want to use the speed tag.

Pitch

The VTML pitch tag raises or lowers the pitch of the text-to-speech voice.

The human voice is capable of a range of inflections and tones. While we’re working on emotional text-to-speech voices, for now, we can use pitch to simulate emotion. For example, a high pitch could inflect excitement or nervousness, while a low pitch could be sadness. You can also change the pitch to place emphasis on a specific word or phrase.

                <vtml_pitch value="150">

                Insert your text here.

                </vtml_pitch>

The pitch value must be between 50 and 200, with 50 being the lowest and 200 the highest.

To use the pitch tag, the opening and closing tags have to enclose one word or more. The enclosed words will have the pitch change applied to them. The next two VTML tags, volume and speed, also have opening and closing tags and must enclose at least one word.

Speed

The VTML speed tag lets you control how fast or slow the text-to-speech voice speaks. A slower speaking voice can be easier to understand, while a faster speed can deliver a lot of information in a short amount of time.

                <vtml_speed value="75">

                Insert your text here.

                </vtml_speed>

The value must be between 50 and 400, with 50 being the slowest and 400 the fastest.

Volume

The VTML volume tag controls how loudly the text-to-speech voice speaks. Along with pitch, volume is another way you can add emphasis to words.

                <vtml_volume value="200">

                Insert your text here.

                </vtml_volume>

The value can be between 0 and 500, with 500 being the loudest.

Remember to include the closing tag when using the pitch, speed, and volume tags. If you want to learn more about VTML, check out our VTML manual.
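Since pitch, speed, and volume all follow the same wrapping pattern, one helper can generate any of them and guard the value ranges listed above. This is a sketch only (the helper is hypothetical; the ranges come from this article, not from NeoSpeech's tooling):

```python
# Valid value ranges for the three wrapping VTML tags, per the ranges above.
RANGES = {"pitch": (50, 200), "speed": (50, 400), "volume": (0, 500)}

def vtml_wrap(tag: str, value: int, text: str) -> str:
    """Wrap text in a vtml_pitch/vtml_speed/vtml_volume tag, checking the range."""
    lo, hi = RANGES[tag]
    if not lo <= value <= hi:
        raise ValueError(f"{tag} value must be between {lo} and {hi}")
    return f'<vtml_{tag} value="{value}">{text}</vtml_{tag}>'

print(vtml_wrap("speed", 75, "Insert your text here."))
```

Generating the tags this way also guarantees the closing tag is never forgotten.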

Learn More about NeoSpeech’s Text-to-Speech

Want to learn more about all the ways Text-to-Speech can be used? Visit our Text-to-Speech Areas of Application page. And check out our Text-to-Speech Products page to find the right package for any device or application.

If you’re interested in integrating Text-to-Speech technology into your product, please fill out our short Sales Inquiry form and we’ll get you all the information and tools you need.


Related Articles

What is a VTML tag?

NeoSpeech VT Editor Overview

Text-to-Speech for Commercial Use


The post Top 5 VTML Tags Infographic appeared first on Text2Speech Blog.

Phonetic Transcription Resources for Speech Technology


Whether you’re doing research in speech technology or adding a text-to-speech function in an application, having a consistent method of representing sounds is important. If you’re not a linguist, it can be overwhelming to look at the many different phonetic representation methods and their charts.

We put together this guide to explain five phonetic representation methods including IPA, Extended SAMPA, Worldbet, SAPI Phoneme Representation, and CMUdict. For some of these methods, we’ve included links to phonetic converters and charts. You can use all of these phonetic representation methods with our phoneme VTML tag to fine-tune the pronunciation of our text-to-speech voices.

All of the phonetic representation methods we’ll go over in this article can be used with our English text-to-speech voices. You can use j-tag with our Japanese text-to-speech voices, Pinyin with our Chinese and Taiwanese text-to-speech voices, and Jyutping with our Cantonese text-to-speech voices. Check your VTML manual for language-specific phoneme tag uses.

5 Phoneme Representation Method VTML Tags

1. International Phonetic Alphabet (IPA)

In 1886, the International Phonetic Association was established in Paris. This association created the International Phonetic Alphabet (IPA) with the goal of making a standard system for the phonetic descriptions of languages. Many phonetic transcription methods are based on IPA.

One of the challenges of using IPA in electronic communications is that a computer cannot read the symbols, and the symbols can be rendered differently across devices. Nowadays, to easily use IPA symbols on the computer, we can use an IPA typing tool or an online converter. If you copy the IPA symbols to another text processor, make sure to use a Unicode font, such as Arial, so the IPA symbols display correctly.

To use IPA with the phoneme VTML tag, use the decimal number of the IPA symbol separated by semicolons.

For example, let’s have Julie pronounce “tomato” with a UK accent.

     <vtml_phoneme alphabet="ipa" ph="116;601;712;109;230;116;111;650;">tomato</vtml_phoneme>
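Those decimal numbers are simply the Unicode code points of the IPA characters, so you can generate the ph string mechanically instead of looking each symbol up. A quick Python sketch (a hypothetical helper, not part of the TTS engine):

```python
def ipa_to_decimal(ipa: str) -> str:
    """Convert an IPA string to VTML's semicolon-separated decimal code points."""
    return "".join(f"{ord(ch)};" for ch in ipa)

print(ipa_to_decimal("təˈmætoʊ"))  # 116;601;712;109;230;116;111;650;
```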


2. Extended Speech Assessment Methods Phonetic Alphabet (Extended SAMPA)

With the rise of computers, linguists and speech technology researchers needed a phonetic representation method that computers could understand. While IPA was a standard phonetic system, it was only seen as graphical symbols by computers.

The Speech Assessment Methods Phonetic Alphabet (SAMPA) was created in the late 1980s under the European Strategic Program on Research in Information Technology (ESPRIT). SAMPA is an ASCII machine-readable phonetic alphabet based on IPA. ASCII stands for American Standard Code for Information Interchange, a character encoding standard for electronic communication created in the 1960s. SAMPA focused on European languages such as French, German, and Italian. Extended SAMPA was designed to include all IPA symbols and to cover more languages. You can use an Extended SAMPA and IPA online converter.

To use Extended SAMPA, here’s what the phoneme VTML tag looks like:

     <vtml_phoneme alphabet="x-sampa" ph="t@'meit@U">tomato</vtml_phoneme>

3. Worldbet

Like Extended SAMPA, Worldbet is an ASCII machine-readable phonetic alphabet based on IPA. Worldbet was designed for African, Asian, European, and Indian languages. The goal was to have a unique symbol for each sound: a single phonetic representation system with unique symbols makes it easier to study multiple languages.

When using Worldbet, here’s what the phoneme VTML tag looks like:

     <vtml_phoneme alphabet="x-worldbet" ph="t&'meitoU">tomato</vtml_phoneme>

4. Carnegie Mellon University Pronouncing Dictionary (CMUdict)

Carnegie Mellon University (CMU) is a leader in computer science research and provides many resources for speech technology. One of our founders studied at CMU. The Speech Group at CMU created the CMU Pronouncing Dictionary (CMUdict) for speech recognition research. The CMUdict online translator shows the pronunciation of American English words.

CMUdict gives “T AH0 M EY1 T OW0” as the phonetic representation of tomato, but we can edit it to give James a UK accent.

     <vtml_phoneme alphabet="x-cmu" ph="T AH0 M AE1 T OW0">tomato</vtml_phoneme>


5. SAPI Phoneme Representation

Microsoft’s SAPI Phoneme Representation is designed to be an easy phonetic representation method for application developers to use. SAPI Phoneme Representation is not meant for fine-tuned control of pronunciation or for linguistic study. Microsoft provides a simple chart for American English pronunciations.

Here’s what it looks like with the phoneme VTML tag.

     <vtml_phoneme alphabet="x-sapi" ph="h eh - l ow 1">hello</vtml_phoneme>

More Resources

With all of these phonetic representation methods, how do you choose? Each phonetic representation method has advantages and disadvantages, so it depends on what your project is. Extended SAMPA and Worldbet are the most popular phonetic representation methods for use with speech technology and electronic communication. Extended SAMPA and Worldbet also cover a range of languages. However, if you’re only working with American English, CMUdict would be sufficient. Or if you’re using SAPI text-to-speech voices, keep it simple with the SAPI Phoneme Representation method.

For a full chart comparing all five phonetic representation methods, look at Appendix A of our VTML manual.

If you’re interested in learning more about phonetic representation methods, check out these academic papers.

Speech Assessment Methods Phonetic Alphabet (SAMPA): Analysis of Urdu” by Hasan Kabir and Abdul Mannan Saleem

Computer-coding the IPA: a proposed extension of SAMPA” by J. C. Wells

ASCII Phonetic Symbols for the World’s Languages: Worldbet” by James L. Hieronymus

Learn More about NeoSpeech’s Text-to-Speech

Want to learn more about all the ways Text-to-Speech can be used? Visit our Text-to-Speech Areas of Application page. And check out our Text-to-Speech Products page to find the right package for any device or application.

If you’re interested in integrating Text-to-Speech technology into your product, please fill out our short Sales Inquiry form and we’ll get you all the information and tools you need.

Related Articles

How 4 Influential Linguists Changed History

Text-to-Speech for Commercial Use

Top 5 VTML Tags Infographic


The post Phonetic Transcription Resources for Speech Technology appeared first on Text2Speech Blog.

Fine Tune Text-to-Speech Pronunciation


Whether you’re in the education, healthcare, or transportation industries, it is important for your text-to-speech software to sound natural and pronounce words correctly. A text-to-speech engine may not know how to say industry specific terms or molecular names. For that reason, a great text-to-speech engine must have customizable capability. Our text-to-speech engines provide this capability. You can modify the pronunciation of our text-to-speech voices with VTML tags and the User Dictionary. In this article, we’ll explain how to use our part of speech and phoneme VTML tags.

Fine Tune Text-to-Speech Pronunciation

Part of Speech

There are some words that are spelled the same, but pronounced differently and have different meanings. How is a text-to-speech engine supposed to know when you’re using record as a verb or a noun? Our text-to-speech engines are pretty smart. While our text-to-speech engines won’t understand what your text means (they’re not AIs), they do understand context. That’s why Julie will pronounce “record” correctly according to its use in a full sentence.

Here’s an example sentence: Did you record the guitar on this record?


We didn’t have to use any VTML tags, because Julie already understood that “record” was being used first as a verb and then as a noun.

In a case where our text-to-speech voices don’t have enough context to determine how to pronounce the word, we could use the partofsp VTML tag, which stands for “part of speech.” The part of speech tag is an easy way to change how a word is pronounced based on whether it is a noun, verb, adjective, and so on, without having to look at phonetic symbol charts.

For our example sentence, here’s how we would use the part of speech tag:

     Did you <vtml_partofsp part="verb">record</vtml_partofsp> the guitar on this <vtml_partofsp part="noun">record</vtml_partofsp>?

The part of speech values you can use are: unknown, noun, verb, modifier, function, or interjection.
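Because the part values form a small fixed set, a helper can guard against typos when generating these tags. A minimal sketch (a hypothetical helper built from the values listed above, not part of NeoSpeech's tooling):

```python
# The six part-of-speech values accepted by the partofsp tag, per the list above.
PARTS = {"unknown", "noun", "verb", "modifier", "function", "interjection"}

def vtml_partofsp(part: str, word: str) -> str:
    """Wrap a word in a vtml_partofsp tag, rejecting unknown part values."""
    if part not in PARTS:
        raise ValueError(f"invalid part of speech: {part!r}")
    return f'<vtml_partofsp part="{part}">{word}</vtml_partofsp>'

print(vtml_partofsp("verb", "record"))
```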

What do you do if the two words are the same part of speech? One example is the word “bass.” Since both the musical instrument and the fish are nouns, we can’t use the part of speech tag to determine pronunciation. That’s where our phoneme VTML tag comes in.

Phoneme

The phoneme VTML tag gives you full control over pronunciation. A phoneme is the smallest unit of speech. As long as the phoneme is recorded in the text-to-speech engine’s vocabulary, you can use it.

For the conundrum we ran into with the word “bass,” here’s how we can use the phoneme tag to modify pronunciation.

     He caught a <vtml_phoneme alphabet="x-cmu" ph="B AE1 S">bass</vtml_phoneme>.


There are five phoneme alphabets for our English text-to-speech voices, which are IPA, Extended SAMPA, Worldbet, CMUdict, and SAPI Phoneme Representation. Learn how to use each phoneme alphabet in our “Phonetic Transcription Resources for Speech Technology” guide.

Our phoneme tags are useful in determining the pronunciation of industry specific jargon, chemical names, and medications.

To learn more about all of our VTML tags and how to use them, check out our VTML manual.

Learn More about NeoSpeech’s Text-to-Speech

Want to learn more about all the ways Text-to-Speech can be used? Visit our Text-to-Speech Areas of Application page. And check out our Text-to-Speech Products page to find the right package for any device or application.

If you’re interested in integrating Text-to-Speech technology into your product, please fill out our short Sales Inquiry form and we’ll get you all the information and tools you need.

Related Articles

New Case Study on eLearning Accessibility

Text-to-Speech for Commercial Use

Top 5 VTML Tags Infographic


The post Fine Tune Text-to-Speech Pronunciation appeared first on Text2Speech Blog.

User Dictionary for VT Editor and SAPI


You can customize the pronunciation of our natural sounding text-to-speech voices with VTML tags or the User Dictionary. The User Dictionary allows you to add and modify words to our text-to-speech engine’s vocabulary. We’ll show you how to access the User Dictionary, add an abbreviation, and modify the pronunciation of a word with International Phonetic Alphabet (IPA) symbols.

If you have terms or acronyms that you use frequently, it’s easier and faster to input the pronunciations once in User Dictionary instead of constantly editing the words with VTML tags. However, if you want a word to be pronounced a certain way one time, it’s better to use VTML tags, which you can learn how to use in our “Fine Tune Text-to-Speech Pronunciation” article.

How to access the User Dictionary

VT Editor

Open up VT Editor and click the book icon called UserDic.

Open the User Dictionary in VT Editor

SAPI

If you’re using one of our SAPI voices, you can access the User Dictionary application by going to the Program Files folder. For 32-bit SAPI, go to Program Files (x86). For 64-bit SAPI, go to Program Files. Then go through the folders VW, then VT, then the name of the text-to-speech voice, M and the sampling rate, then lib, and click on UserDicEng.

Open the User Dictionary for SAPI voices

Adding an abbreviation or acronym

In the Source field, type the acronym or abbreviation you want to have read differently. For this example, select case-sensitive because we only want “US” to be read as “United States.” We still want the word “us” to be read normally.

Add a word in the User Dictionary

The Target field is what the text-to-speech voice will read out. In the Target field, make sure Alphabet is selected as it will allow you to type regular letters so we can type in “United States.”

Select case-sensitive and Alphabet in the User Dictionary

Click Read to listen to how the text-to-speech voice will read the abbreviation. If it sounds good, click OK to add the word.

Note: When you add a word in the User Dictionary, it applies to all instances of that word.

Modifying the pronunciation of a word

To change the pronunciation of a word, you’ll have to input IPA symbols.

In the Target area, when you select Pronunciation Symbol, the window expands to show IPA symbols for vowels and consonants.

Select Pronunciation Symbol in the User Dictionary

You can now type these IPA symbols with your keyboard or select the symbols individually to insert them into the Target field. Here’s a chart of the IPA symbols you can type with your keyboard.

User Dictionary IPA Symbol Keyboard

Learn More about NeoSpeech’s Text-to-Speech

Want to learn more about all the ways Text-to-Speech can be used? Visit our Text-to-Speech Areas of Application page. And check out our Text-to-Speech Products page to find the right package for any device or application.

If you’re interested in integrating Text-to-Speech technology into your product, please fill out our short Sales Inquiry form and we’ll get you all the information and tools you need.

Related Articles

Phonetic Transcription Resources for Speech Technology

Text-to-Speech for Commercial Use

Top 5 VTML Tags Infographic


The post User Dictionary for VT Editor and SAPI appeared first on Text2Speech Blog.
