
NeoSpeech’s Text-to-Speech Embedded SDK Overview


Embed your mobile app or device with NeoSpeech’s natural sounding text-to-speech voices.

Embedded Text to Speech

This blog is part 3 of a 7-part blog series highlighting each of NeoSpeech’s text-to-speech solutions.

Giving your smart phone application or mobile device a voice shouldn’t be difficult, which is why we made it easy.

NeoSpeech’s VoiceText Embedded SDK package was specifically designed with embedded operating systems in mind. Mobile apps and embedded devices run on very specific operating systems, have limited storage space, and can have a low amount of processing power. Finding a high quality text-to-speech engine that can be compatible with an embedded product’s specifications has long been a challenge for developers.

Fortunately, our Embedded SDK package is the solution that easily overcomes this challenge.

So who is this solution for? What does it come with? And how does it work? Let’s take a look at everything you should know about our VoiceText Embedded SDK package.

What is it?

In terms of what the package is composed of, our Embedded package is nearly identical to our Engine package. When you purchase our VoiceText Embedded SDK, we send you the complete text-to-speech engine and SDK.

Text-to-Speech Engine

This is what makes the magic happen! It’s what gives your product a voice! The text-to-speech engine, or TTS engine for short, is able to analyze inputted text and convert it into speech. The method of how it puts together the speech depends on the type of engine you get.

Either way, with the TTS engine in hand, you’ll be able to convert any amount of text into high quality speech instantly!

SDK

The software development kit (SDK) is what will enable you to easily integrate the TTS engine into your custom application or device. An SDK is defined as a set of tools that a software developer uses to help build applications.

So if integrating a powerful TTS engine into an embedded device sounds daunting, don’t sweat it. Our SDK (which includes a very informative manual on how to use it) makes the whole process simple and straightforward.

Who is this solution for?

If we had to sum it up in one sentence, we’d say that the VoiceText Embedded SDK package is for people wanting to give their embedded device or application a voice by hosting the TTS engine locally.

First, let’s define what an embedded device and an application are.

An embedded device is a device that contains a special computing system for a dedicated purpose. These aren’t general-purpose computers like your laptop that can do anything. Instead, embedded devices usually have one or a few specific purposes.

An example is a blood glucose monitor. This is a device that tests the levels of glucose in your blood. People suffering from diabetes often use blood glucose monitors to make sure their blood-sugar levels aren’t too low or too high.

Blood glucose monitors have a specific purpose. They test blood samples and display back the results. They aren’t general-purpose computers that can perform most common tasks. They are embedded devices. Other examples include ATMs, DVD players, smart home devices, and GPS devices.

An application, at least in this context, refers to an application for a mobile device. If you have a smartphone, then you probably have dozens of apps downloaded onto your device. Our VoiceText Embedded SDK is perfect for integrating text-to-speech into smartphone apps.

Now that you know what we mean by “embedded device or application”, you should have a pretty good idea if the Embedded SDK package is right for you. Is the product you’re developing going to be an embedded device? Or is it going to be an app for a smartphone? If you answered yes to either of these two questions, then our Embedded SDK package is probably right for you.

Note that in our one-sentence summary of who the Embedded SDK package is for, we mentioned that it’s for those who want to host their TTS engine locally. Hosting the TTS engine locally means that the engine is located and stored within the device or app itself. The other option would be to host your TTS engine either on your own server or through NeoSpeech’s cloud.

How does NeoSpeech optimize it for my needs?

As mentioned above, embedded devices and applications have very unique operating systems, specifications, and storage space limitations. Therefore, we at NeoSpeech make sure to optimize our Embedded SDK package to fit your needs.

Our Embedded SDK package works with several operating systems, including iOS, Android, Embedded Linux, WinCE, and Windows Mobile. We can also custom port our engine to your specific operating system if we don’t already support it.

You can also choose your desired sampling rate (which affects sound quality) and footprint size. Our embedded TTS engines range anywhere from 8MB to 900MB. Which footprint size and sampling rate you need depends on your device specifications and the application the TTS engine is being used for.
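To make the sampling-rate trade-off concrete, here’s some back-of-the-envelope arithmetic, assuming uncompressed 16-bit mono PCM output (illustrative only; real engine footprints and audio formats vary):

```python
# Rough storage math for synthesized audio, assuming uncompressed
# 16-bit mono PCM (illustrative; actual formats and footprints differ).
def pcm_bytes(sample_rate_hz, seconds, bytes_per_sample=2):
    """Raw audio size for a clip at a given sampling rate."""
    return int(sample_rate_hz * seconds * bytes_per_sample)

low_rate = pcm_bytes(8_000, 10)    # 10 s at telephone quality -> 160,000 bytes
high_rate = pcm_bytes(22_050, 10)  # 10 s at a higher sampling rate -> 441,000 bytes
```

Roughly doubling the sampling rate doubles the audio data per second, which is why lower rates suit tightly constrained devices.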

How does it work?

Here is a look at how you can use our VoiceText Embedded SDK package to integrate text-to-speech into your embedded device or app.

Purchasing process

There are no mind-bending hoops you have to jump through to get your hands on our Embedded SDK package. Just get in touch with our Sales Team. We’ll take it from there and make sure we get all the information we need so we optimize the package to your needs.

Installation

After purchasing the package, we’ll send you the complete TTS engine and SDK for download. You or your developer will need to download these files on the computer you’re using to develop your product.

Verification

Before you can do anything, you have to verify your software. In the same email we send you with the download files, we’ll send you a license key that’ll activate and unlock your TTS engine.

Integration

Here’s where the fun begins. At this point it’s time to integrate the TTS engine into your custom application or device.

To get started, just take a look at the SDK manual that comes with the SDK package. In there you’ll find clear, easy to follow, step-by-step directions on what to do.

Basically, you’ll just need to copy the TTS engine and a few files from the SDK into your device or application. Then, all you need to do is write the commands that tell your product how and when to use the TTS engine to create speech. That’s it!
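As a flavor of what that wiring looks like, here’s a hypothetical sketch in Python; the `EmbeddedTTSEngine` class, its `synthesize` method, and the voice name are invented stand-ins, not NeoSpeech’s actual SDK API:

```python
# Hypothetical sketch of the flow described above; the class and method
# names are invented stand-ins, NOT NeoSpeech's actual SDK API.
class EmbeddedTTSEngine:
    """Stand-in for the TTS engine files copied onto the device."""
    def __init__(self, voice):
        self.voice = voice

    def synthesize(self, text):
        # A real engine returns playable audio; the stub returns a marker.
        return "[{} audio] {}".format(self.voice, text).encode()

engine = EmbeddedTTSEngine(voice="Julie")        # loaded once at startup
audio = engine.synthesize("Battery level low.")  # called whenever the device must speak
```

The point is simply that, once the engine files are on the device, speaking is a local function call with no network round trip.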

Our SDK manual will go over all the commands you could use and how they work. With that in hand, you’ll give your embedded product an engaging, natural sounding voice in no time.

See? We told you it was easy.

NeoSpeech's embedded text-to-speech works for smartphone apps

That’s the overview of NeoSpeech’s VoiceText Embedded SDK solution! If you’ve decided that this is the solution for you, click the button below to get started!

I want the VoiceText Embedded package!

Be sure to check out the other blogs in this series to learn more about our other solutions. And if you need help figuring out which solution is right for you, read our free eBook on the matter.

Here’s the list of blogs in this series covering NeoSpeech’s solutions:

Learn more about NeoSpeech’s text-to-speech

Want to learn more about all the ways Text-to-Speech can be used? Visit our Text-to-Speech Areas of Application page. And check out our Text-to-Speech Products page to find the right package for any device or application.

If you’re interested in integrating Text-to-Speech technology into your product, please fill out our short Sales Inquiry form and we’ll get you all the information and tools you need.

Related Articles

New eBook, “Which Text-to-Speech Product Is Right For You?” released

Check out NeoSpeech’s Interactive Online Text-to-Speech Demo!

What is Text-to-Speech and How Does It Work?

Don’t forget to follow us on LinkedIn, Facebook, Google+, and Twitter!

The post NeoSpeech’s Text-to-Speech Embedded SDK Overview appeared first on Text2Speech Blog.


NeoSpeech Is Working On Singing Text-to-Speech Voices


NeoSpeech singing text to speech

Text-to-speech has become a mainstream technology due to its power to make speech more accessible. It gives both humans and machines the ability to speak. By receiving text prompts, a text-to-speech engine is able to instantly synthesize speech that can sound as realistic as an actual human.

However, there’s so much more that humans can do with their voices that text-to-speech hasn’t been able to replicate in the past. One such example is singing. People can use their voices to produce musical tones while speaking. Modern text-to-speech engines do not have the ability to replicate musical tones.

Now, we at NeoSpeech are working on taking our natural sounding text-to-speech voices and giving them the ability to sing beautiful melodies. Much like how text-to-speech made the ability to speak more accessible, we want to make the ability to sing more accessible.

Want to hear it for yourself? Here’s an example of what our Japanese singing text-to-speech voice sounds like:

How Singing Text-to-Speech Works

Singing text-to-speech isn’t necessarily new. It’s actually been around for a while. In fact, last year we covered the story of Hatsune Miku, the text-to-speech pop star.

Vocaloid computer programs like the one that lets users create songs by the persona Hatsune Miku can synthesize singing. Each program is created by having a voice actor record phonemes in different pitches. Then, using the computer program, users can put these phonemes together to create lyrics and melodies.

The problem with these programs is that they’re usually limited to just one voice and don’t sound very human-like.

NeoSpeech is looking to create more natural sounding singing text-to-speech voices through the use of HMM-based text-to-speech.

As discussed in our article, “Which Speech Synthesis Technique is Better?”, the HMM method is a statistical parametric synthesis technique that generates speech by taking recordings from a voice actor and modifying them to sound as similar as possible to the inputted text.

Basically, HMM-based text-to-speech engines don’t fully preserve the voice of the original voice actor when generating speech. Instead, the engine can modify the speech to sound the way it believes it is supposed to sound.

This flexibility makes HMM-based text-to-speech engines perfect for creating singing voices. It enables the engine to create high and low notes to transform normal words and sentences into songs.

With the engine’s ability to modify the voice in place, the last step is giving the text-to-speech engine knowledge of the music. Much like how a regular text-to-speech engine needs to know the text, a singing text-to-speech engine will also need to have knowledge of the musical structure.

There are several ways this can be achieved. One way is to input musical data into the singing text-to-speech engine with a musical score data sheet from a program like MuseScore. MuseScore is a simple tool that lets you build musical scores (like the one below) on your computer.

musescore sheet

Once the HMM-based text-to-speech engine receives the musical data and the text (or lyrics), it can generate the audio file!
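A toy version of handing the engine that musical knowledge might pair each lyric syllable with a pitch and a duration, then convert the pitch to a target fundamental frequency (F0), the quantity a flexible engine can bend the voice toward. The score entries and helper below are illustrative, not NeoSpeech’s actual data format:

```python
# Toy "musical knowledge" for a singing synthesizer: each score entry is
# (lyric syllable, MIDI pitch, duration in beats). The pitch becomes the
# fundamental frequency (F0) the voice should target.
def midi_to_hz(note):
    """Equal-temperament pitch: MIDI 69 = A4 = 440 Hz."""
    return 440.0 * 2 ** ((note - 69) / 12)

score = [("jin", 64, 0.5), ("gle", 64, 0.5), ("bells", 64, 1.0)]  # three E4 notes
f0_targets = [(syll, round(midi_to_hz(note), 1), beats) for syll, note, beats in score]
```

A real engine would also smooth transitions between notes, but the core idea is this mapping from score to pitch targets.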

Like we said though, that’s just one way out of many that singing text-to-speech can be achieved. We at NeoSpeech have been experimenting with the best ways to create a singing text-to-speech voice. While there’s still some time until we perfect the process, we can happily say that we’ve already had some great results!

NeoSpeech’s Singing Text-to-Speech Samples

Using the same voice in the above example, we gave our singing Japanese text-to-speech voice the musical data and lyrics to Jingle Bells (in Japanese, of course)! Here’s a sample of the synthesized singing:

Then, we threw in some background music to truly make it sound like a professional song recording:

Pretty cool, right? For now, we at NeoSpeech have only worked with our Japanese voices on singing text-to-speech. Don’t worry though, we’re not forgetting about our other languages!

listening to NeoSpeech synthesized singing

Singing text-to-speech is still in development, but NeoSpeech is working hard to push the boundaries of what text-to-speech is capable of and we’re hoping to make the ability to sing accessible to all in the near future!

What do you think?

Did you enjoy our singing text-to-speech examples? What are you most excited about using singing text-to-speech for? Let us know in the comments!

Learn More about NeoSpeech’s Text-to-Speech

Want to learn more about all the ways Text-to-Speech can be used? Visit our Text-to-Speech Areas of Application page. And check out our Text-to-Speech Products page to find the right package for any device or application.

If you’re interested in integrating Text-to-Speech technology into your product, please fill out our short Sales Inquiry form and we’ll get you all the information and tools you need.

Related Articles

Check out NeoSpeech’s Interactive Online Text-to-Speech Demo!

Learn how to enable NeoSpeech’s Text-to-Speech voices in Microsoft Word

What is Text-to-Speech and How Does It Work?

The post NeoSpeech Is Working On Singing Text-to-Speech Voices appeared first on Text2Speech Blog.

How to Integrate Text-to-Speech in the Internet of Things


Want to give your IoT product a voice? Still not convinced that text-to-speech is a must have feature for an IoT device? Keep on reading to find all the answers you need.

Text to speech in the internet of things

If you’re familiar with the Internet of Things (IoT), then you already know that IoT is all about being connected, sharing information, and communication. IoT separates great devices from ordinary ones because of the value it can provide to the user. (If you’re not familiar with IoT yet, read our article on it here!)

Text-to-Speech technology can be viewed the same way. Text-to-speech makes interactions between devices and humans more natural and engaging. You could say that text-to-speech is also about being connected, sharing information, and communication.

It’s no wonder that many of today’s IoT devices feature text-to-speech.  If current trends continue, almost all IoT devices will feature text-to-speech in the next few years.

So how does a manufacturer go about integrating text-to-speech into an IoT device? After all, these devices usually have limited computing power and storage space, and must work in real-time.

Thankfully, it can be a lot easier than you think with NeoSpeech. Whether you want to host a text-to-speech engine on your device, in our cloud, or in your own cloud, we have the solutions that can make it happen without a hassle.

So which option is right for you? And how can we make that happen? Keep on reading to find your answers.

Why does IoT need a voice?

In case you’re not convinced yet that an IoT device should have a text-to-speech voice, hear us out.

Our phones, alarm clocks, thermostats, appliances, and just about any other device that can connect to the internet is a part of the Internet of Things. These devices use their ability to connect to the internet to communicate information to other devices and people.

This makes IoT devices much smarter than regular devices. Your thermostat at home can record that the temperature in your house is 76 degrees and communicate it to your car as you’re driving home. After your car relays the message to you, you can tell your car to set the thermostat to 68 degrees. Your car would then communicate that to the thermostat which would set the new temperature.
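That exchange can be sketched as two “devices” passing messages through a shared channel standing in for the internet (the device names and message fields here are invented for illustration):

```python
# Toy sketch of the thermostat/car exchange above: two devices swap
# messages through a shared in-memory channel standing in for the internet.
class Channel:
    def __init__(self):
        self.inboxes = {}

    def send(self, to, message):
        self.inboxes.setdefault(to, []).append(message)

    def receive(self, device):
        return self.inboxes.pop(device, [])

net = Channel()
net.send("car", {"from": "thermostat", "temp_f": 76})   # thermostat reports the reading
reading = net.receive("car")[0]                         # car relays 76 F to the driver
net.send("thermostat", {"from": "car", "set_f": 68})    # driver's command goes back
command = net.receive("thermostat")[0]                  # thermostat applies the new set point
```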

These two IoT devices were able to communicate and share information with each other, which makes both devices more accessible, usable, and valuable.

IoT is all about communicating and sharing information. Therefore, shouldn’t your IoT device be able to communicate and share information with its user in the most natural, humanlike way possible?

Text-to-speech enables your devices to talk to your users. This makes the device appear smarter. It also makes it much easier to use. Your users would be able to perform other tasks, such as driving or cooking, while hearing information from whatever IoT device they’re using.

Without text-to-speech, your IoT device would require a screen that users would have to focus on to read whatever message it has. Frankly, this is starting to become outdated.

After all, if an IoT device is able to communicate efficiently with other devices, shouldn’t it be able to communicate with people as efficiently as possible too?

How to give IoT a voice

In order to have a voice, you need a text-to-speech engine. The engine is where the database of recordings from the voice actor is stored. It’s what is able to convert inputted text into audible speech instantly.

When integrating text-to-speech into your IoT product, the first thing you need to decide is where to store the text-to-speech engine.

The engine can be installed locally on your device, within an IoT hub, in your cloud, or in NeoSpeech’s own cloud.

Stored locally

If your device has a sufficient amount of computing power and storage space, and/or if your product won’t always be connected to the internet, then you want the engine to be installed locally on the device.

Some IoT devices are meant to go wherever the user goes. Many times the user will go where there is no internet connection. If that’s the case, then the IoT device must be able to work without the internet.

By storing the engine within the device, you can ensure that the text-to-speech function will always work. Whenever the device needs to convert a string of text into speech, it’ll send the request to the engine stored in the device, instantly convert it into a speech output, and then play the speech.

To make this happen, all you need is NeoSpeech’s VoiceText Embedded SDK package. This package will come optimized for your specific embedded device system requirements (and our embedded engines have small footprints so they don’t take up too much memory). You’ll get the text-to-speech engine, which you would store in your device, and then an SDK which will make it very easy for you to program your device to interact with the engine exactly how you want it.

Stored in an IoT hub

This solution is perfect for those making a suite of smart home products that all connect to one hub.

A smart home hub is a device that connects and controls all of your other smart home devices. Amazon Echo can be thought of as a hub, as you can use it to control your other smart home devices (such as telling the Echo to turn off your upstairs lights).

Smart home hubs usually have more processing power and storage space than other IoT devices, which makes it easier to store the engine in it.

When the smart home hub receives a text-to-speech request from an end-device, it can convert it into speech and then either play the speech through the hub’s speakers or send the audio file back to the end-device, which then plays the audio.
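A minimal sketch of that hub-side routing, with a stub standing in for the TTS engine stored on the hub (the function and device names are invented for illustration):

```python
# Hub-side routing sketch; synthesize() is a stub for the TTS engine
# stored on the hub, and the device names are invented.
def synthesize(text):
    return ("[audio] " + text).encode()  # placeholder for real synthesis

def handle_tts_request(text, reply_to, play_on_hub=False):
    audio = synthesize(text)
    if play_on_hub:
        return ("hub-speaker", audio)  # speak through the hub itself
    return (reply_to, audio)           # send the audio back to the end-device

target, audio = handle_tts_request("Front door unlocked.", reply_to="door-sensor")
```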

Either way, we have the text-to-speech packages that will make it easy for you to build.

And what makes this solution great is that the devices don’t need an internet connection for the text-to-speech conversions; they only need to be able to connect to the smart home hub through a local network connection.

In your cloud

If your IoT device functions by always having a connection to your company’s server(s), then we have good news for you! Our VoiceText TTS Server SDK package lets you integrate our text-to-speech engine into your server.

This process is just as simple as the ones mentioned above. All you need to do is install our engine and Server SDK in your server, and then use our Server SDK to program how your product will use the engine and when.

Then, whenever your IoT device needs to make speech, it will just send the request to your server, which will then instantly send back the audio file of the speech to play.

This enables you to free up limited memory space and processing power on your IoT device, while still allowing it to perform text-to-speech conversions in real time.

In NeoSpeech’s cloud

Did the option above sound perfect for you, except for the part where you don’t have a server? Don’t worry, you can just connect all of your IoT devices to NeoSpeech’s cloud through our Web Service platform.

For this, we’ll just send you the API your programmer needs to allow your IoT devices to connect to our text-to-speech server.

Once it’s set up, whenever your IoT device needs to say something, it’ll send the request to our cloud which will send back the speech.
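The round trip might look like the sketch below, where `fake_cloud_tts` stands in for the real Web Service call; the payload fields and the stub itself are invented for illustration, not the actual API:

```python
import json

# Shape of the cloud round trip; fake_cloud_tts and the payload fields
# are invented stand-ins, not NeoSpeech's actual Web Service API.
def fake_cloud_tts(request_body):
    request = json.loads(request_body)
    return ("[audio] " + request["text"]).encode()  # a real service streams audio back

payload = json.dumps({"voice": "James", "text": "Oven preheated to 350 degrees."})
audio = fake_cloud_tts(payload)  # a real device would POST this over HTTPS
```

The device-side logic stays tiny: build a request, send it, play what comes back.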

Using NeoSpeech for IoT

Text-to-speech and IoT can be a match made in heaven. Giving your IoT device a voice will make it easier to use and increase customer satisfaction. Plus, with NeoSpeech’s suite of text-to-speech solutions, it can be very easy to integrate text-to-speech, just give us a call and we’ll help make it happen!

What do you think?

Do you think all IoT devices should feature text-to-speech? How do you see speech technology and IoT growing in the next few years? Let us know in the comments!

Learn More about NeoSpeech’s Text-to-Speech

Want to learn more about all the ways Text-to-Speech can be used? Visit our Text-to-Speech Areas of Application page. And check out our Text-to-Speech Products page to find the right package for any device or application.

If you’re interested in integrating Text-to-Speech technology into your product, please fill out our short Sales Inquiry form and we’ll get you all the information and tools you need.

Related Articles

Check out NeoSpeech’s Interactive Online Text-to-Speech Demo!

Intelligent Virtual Assistants Continue to Choose NeoSpeech’s Text-to-Speech Voices

What is Text-to-Speech and How Does It Work?

 

Don’t forget to follow us on LinkedIn, Facebook, Google+, and Twitter!

The post How to Integrate Text-to-Speech in the Internet of Things appeared first on Text2Speech Blog.

NeoSpeech SAPI Text-to-Speech Overview


Are you in need of high quality text-to-speech voices that are compatible with any SAPI compliant application? Read on to find out how NeoSpeech makes it easy.

NeoSpeech SAPI text to speech

This blog is part 4 of a 7-part series highlighting each of NeoSpeech’s text-to-speech solutions.

SAPI, or Speech Application Programming Interface, is an API developed by Microsoft that allows the use of speech technology in Windows applications. SAPI can be thought of as a set of rules and protocols that developers follow to integrate speech functions, such as text-to-speech, into their SAPI compliant application.

SAPI is well known in the speech technology world because it is accessible to all, easy to follow, and used by hundreds of the most popular Windows applications today.

This is why we at NeoSpeech released our VoiceText SAPI product a long time ago. Our SAPI text-to-speech voices sound as natural and realistic as NeoSpeech’s voices have always been known to sound. However, these voices are special because they were developed with SAPI’s specifications and protocols in mind.

So what does that mean? It means that by using our VoiceText SAPI, you can integrate any of NeoSpeech’s voices into any Windows application that uses SAPI! It’s simple, easy, and produces great results!

Let’s take a deeper look at how you can give your Windows application a lifelike voice with NeoSpeech’s VoiceText SAPI.

What is SAPI?

If you’re still scratching your head and not too sure what SAPI is, don’t worry, this short explanation should clear that up.

An API (Application Programming Interface) is a set of rules, protocols, and tools that specify how software components should interact with each other. In short, an API enables applications to communicate with each other. This video does a fantastic job of explaining exactly what APIs do.

Let’s say you’re on your smart phone and you look up the address of a movie theater in your browser app. You want to find out how to drive there, so you click the address and it automatically opens up your Maps app with the address already put in it. In this case, your browser app and the Maps app were able to communicate with each other thanks to an API that defined how the two apps would talk to each other.

SAPI is a speech-specific API (hence adding the “S” in front of “API”) that enables the use of speech recognition and speech synthesis within Windows applications.

So if you download a SAPI-compliant text-to-speech voice onto your computer, you can follow the protocols set by SAPI and write the code that tells your Windows applications how to use the voice. Once you’re done, your Windows application will use the voice exactly as you programmed it to.
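The SAPI pattern boils down to: enumerate installed voices, pick one, speak. The real interface is a COM object (`SAPI.SpVoice`), so the toy classes below only mirror its shape for illustration and are not the actual API:

```python
# Toy stand-ins for the SAPI pattern (enumerate voices, pick one, speak).
# The real interface is a COM object (SAPI.SpVoice); these classes only
# mimic its shape for illustration.
class FakeVoice:
    def __init__(self, description):
        self.description = description

class FakeSpVoice:
    def __init__(self, voices):
        self.voices = voices
        self.voice = voices[0]  # the system's default voice

    def get_voices(self):
        return self.voices

    def speak(self, text):
        return "{} speaks: {}".format(self.voice.description, text)

tts = FakeSpVoice([FakeVoice("NeoSpeech Julie"), FakeVoice("NeoSpeech James")])
tts.voice = tts.get_voices()[1]     # select a different installed voice
line = tts.speak("Hello from SAPI.")
```

Any SAPI-compliant voice installed on the machine shows up in the enumeration, which is why applications can use new voices without code changes.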

Sounds easy enough, right? Even better is that many SAPI applications already have text-to-speech functionality built into them, so all you need to do is download and install the SAPI voices and you’re ready to go! No coding required!

Here is an example of a few Windows applications that are SAPI compatible:

  • Microsoft Office apps (Excel, Word, etc.)
  • Microsoft Narrator
  • Microsoft Voice Command
  • Adobe Reader
  • Adobe Captivate
  • JAWS
  • Window-Eyes
  • CoolSpeech
  • NonVisual Desktop Access

Any application that is SAPI compliant can use any of NeoSpeech’s voices thanks to our VoiceText SAPI.

What is NeoSpeech’s VoiceText SAPI?

NeoSpeech’s VoiceText SAPI isn’t much different from our Engine, Server, or Embedded SDK packages. With all of those packages, you download our text-to-speech engine(s) and an SDK that gives you direction on how to build an application using our voices.

Our VoiceText SAPI product is essentially a separate version of our Engine, Server, or Embedded package which was developed to be compatible with SAPI. So, with our VoiceText SAPI, you can integrate NeoSpeech’s voices into any engine, server, or embedded SAPI application!

Who is this solution for?

The Engine, Server, and Embedded applications mentioned earlier are for people who want to build custom applications. VoiceText SAPI is for people who want to simply integrate our voices into already existing Windows applications (as long as they’re SAPI compliant), or for people who want to build their own custom Windows application that is SAPI compliant.

Do you want to use Microsoft Word to create voice prompts? Do you have a Windows Server application for your call center and you want to integrate our text-to-speech voices into it? Are you using an authoring tool or eLearning application that could use a better sounding voice? If so, then VoiceText SAPI is what you need!

How does NeoSpeech optimize it for my needs?

We’ll make sure to send you the right VoiceText SAPI package for your needs. We have SAPI versions of our Engine, Server, and Embedded SDK packages. Whether you need the Engine, Server, or Embedded version depends on the type of device and application you’re working with.

All of our voices have 32-bit and 64-bit SAPI versions. The kind you need depends on your operating system.

To make all of this simple, just tell our sales team what kind of computer you’re using, what voices you want, and what application you want to integrate them into. We’ll make sure the package we send you meets all of your specifications.

How does it work?

Integrating our SAPI text-to-speech voices into your Windows application is a very simple process with VoiceText SAPI. Let’s go over the steps:

Purchasing process

Do you need SAPI voices or are you considering it? Get in touch with our Sales Team by filling out our online form! We will then quickly get in contact with you, get all the information we need, and answer any questions you may have. Then, our Sales Team will send you your VoiceText SAPI package!

Installation

We’ll send the complete package to your email address. Just click the download link and follow the instructions to get our SAPI voices downloaded and installed on your computer.

Verification

In that same email, we’ll send you instructions on how to verify that your copy of our software is not pirated. This only takes a few minutes but is necessary in order to have full access to our text-to-speech engine.

Integration

At this point, the engine is ready to be used!

One thing you can do to start is set any of your new voices as the default voice for your Windows computer. You can do this by going into the Speech Properties section of the Control Panel and setting your preferred voice to whichever NeoSpeech voice is your favorite.

You can also start accessing the voice(s) from all of your other SAPI applications on your computer! Most of your applications should already have speech settings that you can use within the application. An example of this would be the built-in Speak Command in Microsoft Word that reads out the text on a Word document.

Text to Speech SAPI

That’s the overview of our VoiceText SAPI package! If you want to use our high quality voices in your Windows applications, then this is the solution you need! Click the button below to get started.

I want VoiceText SAPI!

Don’t forget to check out the other blogs in this series as we highlight all of NeoSpeech’s text-to-speech solutions:

Learn more about NeoSpeech’s text-to-speech

Want to learn more about all the ways Text-to-Speech can be used? Visit our Text-to-Speech Areas of Application page. And check out our Text-to-Speech Products page to find the right package for any device or application.

If you’re interested in integrating Text-to-Speech technology into your product, please fill out our short Sales Inquiry form and we’ll get you all the information and tools you need.

Related Articles

New eBook, “Which Text-to-Speech Product Is Right For You?” released

Check out NeoSpeech’s Interactive Online Text-to-Speech Demo!

What is Text-to-Speech and How Does It Work?

Don’t forget to follow us on LinkedIn, Facebook, Google+, and Twitter!

The post NeoSpeech SAPI Text-to-Speech Overview appeared first on Text2Speech Blog.

New VT Editor Tutorial Video


NeoSpeech’s popular text-to-speech application, VT Editor, has a new tutorial video! In this video, we cover all of the useful and powerful features the application comes equipped with, and we teach you how you can take advantage of them.

VT Editor lets you take an unlimited number of words and convert them into speech. Then, you can edit any and all parts of the speech thanks to our easy-to-use VTML tags. Finally, you can either save the text prompts for later or download and save the speech as an audio file.

Without further ado, here’s our VT Editor Tutorial:

Like what you see? You can purchase VT Editor from NeoSpeech any time!

First, you’ll want to decide which text-to-speech voice you want. Thankfully, we have an interactive online demo that lets you test any of our 40+ voices in 15 languages.

Once you’ve picked your voice, just fill out this form to get in contact with our friendly Sales Team, who will send you the package that you need!

What do you think?

Was our tutorial video helpful? Let us know in the comments!

Learn More about NeoSpeech’s Text-to-Speech

Want to learn more about all the ways Text-to-Speech can be used? Visit our Text-to-Speech Areas of Application page. And check out our Text-to-Speech Products page to find the right package for any device or application.

If you’re interested in integrating Text-to-Speech technology into your product, please fill out our short Sales Inquiry form and we’ll get you all the information and tools you need.

Also, don’t forget to follow us on Twitter, Facebook, and LinkedIn!

Related Articles

NeoSpeech’s Text-to-Speech Is More Intelligent Than You Think

Text-to-Speech Features That Make You More Productive

What is Text-to-Speech and How Does It Work?

The post New VT Editor Tutorial Video appeared first on Text2Speech Blog.

NeoSpeech VT Editor Overview

Create high quality voice files from your computer with NeoSpeech’s VT Editor text-to-speech application.

VT Editor text to speech application

This blog is part 5 in a 5-part series highlighting each of NeoSpeech’s text to speech solutions

With NeoSpeech’s VT Editor, you can create voice prompts right from your computer. You can also save the voice files for later use. This is a great solution for eLearning applications, announcement systems, and much more.

While most of NeoSpeech’s solutions provide you with the necessary tools to build custom speech-enabled applications and products, VT Editor is unique because it already is a fully functioning, ready to go text-to-speech application. All you need to do is install it on your computer and you’ll be creating text-to-speech files in no time.

So who is this solution for? And how does it work? In this blog, we’ll tell you everything you need to know about VT Editor.

What is it?

As stated earlier, VT Editor is a text-to-speech application for your computer. It’s an application similar to Microsoft Word, in the sense that when you open it you have a blank screen where you can type or insert any amount of text.

But unlike Word, VT Editor is equipped with the NeoSpeech voice of your choice! Not only will it instantly convert the text into audible speech, but it will also give you the ability to easily modify the speech by editing variables like pitch, speed, pauses, and pronunciation. Finally, you can export and save audio files of the speech for later use.

If you want to see VT Editor in action, check out our new tutorial video below:

Who is this solution for?

VT Editor is a great solution for both personal and commercial users.

Personal users can use VT Editor for a variety of reasons. Many use it to copy and paste text from documents and online articles into the editor and then convert it into speech so they can listen to the content right away or save it for later. Some even use it to create voice prompts so they can communicate easier. There’s no telling how many useful things you can do with VT Editor.

For commercial use, having an application that’s capable of making high-quality, professional sounding voice recordings available at any moment 24 hours a day is invaluable. You can use VT Editor to create and save voice prompts for eLearning courses, IVR servers, call centers, announcement systems, and many other applications.

Another great thing about VT Editor is that it doesn’t require an internet connection to use. Most text-to-speech applications are cloud-based, meaning you have to be online to access them. VT Editor gets stored directly on your computer allowing you to use it whether or not you have an internet connection.

How does NeoSpeech optimize it for my needs?

First and foremost, we should say that VT Editor is a Windows-only application. So make sure your computer runs on a Windows operating system!

While most NeoSpeech solutions can be customized to fit your needs, VT Editor is an application that will be the same for everyone. You can rest assured, though, knowing that each VT Editor package we send comes fully equipped with all of the powerful editing and exporting tools that make VT Editor such a great text-to-speech application.

The biggest, and pretty much only, choice you need to make is to decide which of our 40+ voices in 15 languages you want!

How does it work?

So, you’ve decided that you want VT Editor for personal use or for your business. Just follow the steps below and you’ll have VT Editor installed and making voice prompts in no time!

Purchasing process

First, you’ll need to get in touch with our Sales Team. Thankfully, we made it easy with this short form. Once you fill out the Sales Inquiry form, our team will promptly reach out to you to get all the information we need and to send you your copy of VT Editor.

Installation

Installing VT Editor is a simple process. We’ll email you a link that starts downloading the package once clicked. Just follow the on-screen instructions and choose where you want to save VT Editor on your computer.

Verification

In the same email, we’ll also send you the license key that you can use to verify your copy of VT Editor, along with a set of instructions on how to do that.

Once you’ve verified your copy of VT Editor, you’ll have full access to it, so make sure you don’t skip this step!

That’s it!

From here, you can start making voice files with ease! VT Editor is a popular text-to-speech solution because it is a ready to go application. If you’re interested in purchasing VT Editor, get started now by getting in touch with our friendly Sales Team.

I want VT Editor!

Don’t forget to check out the other blogs in this series as we highlight all of NeoSpeech’s text-to-speech solutions:

Learn more about NeoSpeech’s text-to-speech

Want to learn more about all the ways Text-to-Speech can be used? Visit our Text-to-Speech Areas of Application page. And check out our Text-to-Speech Products page to find the right package for any device or application.

If you’re interested in integrating Text-to-Speech technology into your product, please fill out our short Sales Inquiry form and we’ll get you all the information and tools you need.

Related Articles

New eBook, “Which Text-to-Speech Product Is Right For You?” released

Check out NeoSpeech’s Interactive Online Text-to-Speech Demo!

What is Text-to-Speech and How Does It Work?

Don’t forget to follow us on LinkedIn, Facebook, Google+, and Twitter!

The post NeoSpeech VT Editor Overview appeared first on Text2Speech Blog.

How to use NeoSpeech’s text-to-speech voices in Adobe Captivate

Adobe captivate text to speech

Did you know that Adobe has teamed up with NeoSpeech to enable users to add narration to their Captivate projects with NeoSpeech’s natural sounding text-to-speech voices? Take advantage of this feature and add professional narrations to your courses.

Adobe Captivate has long ago established itself as one of the premier authoring tools available today, and it’s not hard to see why. With a wide range of easy to use features, such as responsive sliders and video-audio support, Captivate makes creating eLearning content easy. Educators have been using Captivate for over 15 years to create digital courses, demonstrations, quizzes, and much more.

One of Captivate’s most used and popular features is the ability to add narration. Adding narration to eLearning content is essential to helping students focus, be engaged, and retain information.

Users can add narration by recording their own voices or by using text-to-speech. Don’t like your own voice? No microphone, audio-recording program, or time? No problem. Text-to-speech has you covered! Save time by using text-to-speech to narrate the lecture notes you’ve already put together.

By simply writing out the lines of speech and placing them in your course where you want them to go, you can add professional sounding narration thanks to NeoSpeech’s realistic text-to-speech voices. And the best part? You can do this for free!

Here are a few quick steps to get started using NeoSpeech in Adobe Captivate!

How to use Captivate’s text-to-speech feature

To use Captivate’s text-to-speech feature, go to the slide you want to add text-to-speech to and open the Slide Notes Panel. The Slide Notes Panel is accessible through the Window tab.

text to speech captivate 1

Next, click the Text-to-Speech or Closed Captioning button (The title of the button may be different depending on which version of Captivate you have).

text to speech captivate 2

Now, the Speech Management tool will pop up. Within this tool, you can write your narration. You can select the text-to-speech voice of your choice through the Speech Agent drop-down menu.

text to speech captivate 3

After you’ve selected your voice and written the narration, you can save the narration and listen to a preview (the preview feature is available in newer versions of Captivate). That’s all you need to know to start using Captivate’s text-to-speech feature!

How to use NeoSpeech’s free voices in Captivate

If you don’t see NeoSpeech’s voices within the Speech Agent menu, you’ll need to download NeoSpeech’s voice package. Adobe lets you download NeoSpeech’s text-to-speech package for free from their website! You can find all the links for the download files for each version of Captivate here: https://helpx.adobe.com/captivate/kb/captivate-text-speech-converters.html.

Once you’ve downloaded and installed the package, you will be able to use NeoSpeech’s voices in Captivate! Here are the NeoSpeech voices included in Adobe’s free package:

  • James – Male – US English
  • Paul – Male – US English
  • Julie – Female – US English
  • Kate – Female – US English
  • Bridget – Female – UK English
  • Chloe – Female – Canadian French
  • Yumi – Female – Korean


Again, once the NeoSpeech package is installed you’ll be able to use any of those voices in your Captivate projects!

How to use additional NeoSpeech voices in Captivate

What if you want a different NeoSpeech voice? Or maybe you need a voice in a different language? After all, NeoSpeech currently has over 40 voices in 15 languages.

(We keep adding new languages and voices. We’ll have 35 languages available in 2 years!)

If you’re interested in using an additional voice in Captivate, you’ll have to purchase the voice directly through NeoSpeech. Get in touch with our friendly Sales Team and they will tell you everything you need to know about pricing and how to install our voices for use in Captivate.

NeoSpeech in Adobe Captivate

That’s everything you need to know about using NeoSpeech’s text-to-speech voices in Adobe Captivate! We hope this guide was helpful and that you can start creating professional sounding narrations for your courses with ease.

What do you think?

Do you use Captivate? Have you tried, or are you interested in trying, NeoSpeech’s voices? Let us know in the comments!

Learn More about NeoSpeech’s Text-to-Speech

Want to learn more about all the ways Text-to-Speech can be used? Visit our Text-to-Speech Areas of Application page. And check out our Text-to-Speech Products page to find the right package for any device or application.

If you’re interested in integrating Text-to-Speech technology into your product, please fill out our short Sales Inquiry form and we’ll get you all the information and tools you need.

Related Articles

Check out NeoSpeech’s Interactive Online Text-to-Speech Demo!

How NeoSpeech is giving people their voices back

What is Text-to-Speech and How Does It Work?

 

Follow us on LinkedIn, Facebook, Google+, and Twitter!

The post How to use NeoSpeech’s text-to-speech voices in Adobe Captivate appeared first on Text2Speech Blog.

Text-to-Speech for Personal Use

Interested in using text-to-speech just for yourself? Find out how NeoSpeech can help.

Text to speech for personal use

Text-to-speech can be a fantastic tool for personal users! Not only does it make the world more accessible to those who need it, but it can also be a very productive tool to students, workers, and just about anyone else.

NeoSpeech is known for producing the best, most natural sounding text-to-speech voices on the market today, so it’s no wonder why we receive so many requests from personal users to purchase our voices.

While we focus primarily on providing text-to-speech solutions for commercial use with packages that help developers build custom speech-enabled devices or applications, we also offer a few solutions for those just wanting to use our text-to-speech voices for themselves!

If you’re interested in text-to-speech for personal use, read on to find out everything you need to know about purchasing and using our voices.

Applications of NeoSpeech for personal users

For personal use, text-to-speech usage basically breaks down into two categories: either it communicates information to you, or you use it to communicate information to others.

Most personal users of NeoSpeech are using our text-to-speech voices to have information communicated to them. They do this by taking text from online documents, articles, or web pages and entering it into our text-to-speech engine that is able to convert all of the text into speech instantly.

This is especially useful for people who have trouble reading text on a screen. People with visual or learning disabilities can often find it hard to read text, making text-to-speech an extremely helpful tool.

Some personal users enjoy text-to-speech because it increases their productivity. For example, someone may want to have their information read aloud to them while they do other tasks. Or they can save the audio files and listen to them later like when they’re driving.

We also have personal users who use our engines to communicate for them. Just recently we published the story of Debbie, a woman who uses our VT Editor program to communicate on the phone. While VT Editor wasn’t built as an Augmentative and Alternative Communication (AAC) device, Debbie uses it to type in what she wants to say and puts her phone to her computer speakers while our text-to-speech voice speaks.

There could be many reasons why you want to use text-to-speech for personal use, but the important thing to remember is that you are using it for yourself. If you are using the voices for your company, or you’re distributing the audio content produced by our text-to-speech voices (such as in a YouTube video), then you’ll need to purchase an audio distribution license.

NeoSpeech products for personal users

As mentioned earlier, most of NeoSpeech’s products are toolkits that developers can use to build their custom products and applications with our text-to-speech voices. Personal users often don’t have the time or resources to build a text-to-speech application, meaning they need one that’s already built and ready to go.

NeoSpeech’s VT Editor is exactly that. VT Editor is a Windows application that you can install on your computer and use at your pleasure. Whenever you want to convert text into speech, just type or paste the text into the application and it will convert it into speech using the NeoSpeech voice of your choice.

Similarly, NeoSpeech’s SAPI voices can also be used by personal users. SAPI is an API developed by Microsoft that enables the use of speech technology. Basically, if you download our SAPI voices onto your computer, you can use them in any other application on your computer, as long as it is SAPI-compliant. SAPI-compatible Windows applications include all Microsoft Office apps, Adobe Reader, and Adobe Captivate.

Purchasing process for personal users

Are you interested in using our text-to-speech voices for personal use? To get started, make sure you check out our interactive online demo so you can test all of our voices and languages to make sure you find the right voice for you.

When you’re ready, fill out our short Sales Inquiry form to get in contact with our friendly Sales Team. In the “Company” section of the form, just write “Self” since you plan on using our voices just for yourself.

Once you submit your inquiry, our Sales Team will get in touch with you to get all the information they need to find the right package for you. Here are some of the important details we’ll need:

  • How you plan on using the text-to-speech.
  • What type of computer you have and what operating system it’s running on.
  • Which voice(s) you want.

Once we have all of the information we need, we’ll send you pricing and then send you your download package once we receive payment.

Important things you should know

Before you decide to purchase any of NeoSpeech’s solutions for personal use, keep the following in mind:

  • VT Editor and our NeoSpeech SAPI voices for personal use are only compatible with Windows computers. We do not offer products for Mac users.
  • When you purchase our solutions for personal use, you purchase one license that will let you use our technology. That license will only work for 1 device! You will not be able to use our VT Editor or SAPI voices on other devices. If you ever change devices, you’d have to purchase a new license. However, keep in mind that we generally do not sell multiple personal use licenses to the same person.
  • Distributing NeoSpeech voices is considered commercial use and requires a commercial license. Creating audio files to put in a video (such as a YouTube video), putting audio files on a website, hosting a presentation, and creating an online audio book are all considered commercial usage, not personal use.

Learn More about NeoSpeech’s Text-to-Speech

Want to learn more about all the ways Text-to-Speech can be used? Visit our Text-to-Speech Areas of Application page. And check out our Text-to-Speech Products page to find the right package for any device or application.

If you’re interested in integrating Text-to-Speech technology into your product, please fill out our short Sales Inquiry form and we’ll get you all the information and tools you need.

Related Articles

Check out NeoSpeech’s Interactive Online Text-to-Speech Demo!

How NeoSpeech is giving people their voices back

What is Text-to-Speech and How Does It Work?

Follow us on LinkedIn, Facebook, Google+, and Twitter!

The post Text-to-Speech for Personal Use appeared first on Text2Speech Blog.


What is a VTML tag?

VTML stands for Voice Text Markup Language and it’s a language specific to NeoSpeech’s text-to-speech software. VTML lets you modify how our text-to-speech voices read your text. You can edit the prosody (speech rhythms like speed and pitch) of our text-to-speech voices to make them sound more natural.

When we talk, we’re conveying more than just the literal meaning of the words we’re saying. We’re communicating another level of meaning with the way we speak. VTML lets you customize how our text-to-speech voices speak so you can communicate more effectively.

If you’ve worked with another markup language before, such as HTML, VTML will look similar. Both VTML and HTML are languages through which people can tell a software how to format, or in the case of text-to-speech, how to read a text prompt. If you’ve never used a markup language, we’ll break down what makes up a VTML tag.

Components of a VTML tag

Every VTML tag starts and ends with angle brackets (< >). These angle brackets let the software know that the text inside is a command for it to complete. The first angle bracket is always < and the last angle bracket is always >. Think of the angle brackets as hands that are holding the text inside together.

First, we start with an angle bracket.

     < 

Next, we let the software know which language we are using. In this case, VTML.

     <vtml

Note that when talking about VTML we capitalize it because we are using it as an acronym. When writing a tag, all text should be lowercase.

An underscore tells the software a command is coming next.

     <vtml_

For a command, we’ll do speed.

     <vtml_speed

Next, we have a space and specify a property of the command. In this case for speed, it will be value.

     <vtml_speed value

Then we choose a numeric value. For the speed property, you can choose a value between 50 and 400, with 100 being normal speed and higher values being faster.

     <vtml_speed value="150">
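As a sketch, the pieces assembled above can also be put together programmatically. This is a hypothetical Python helper for illustration only, not part of any NeoSpeech product; Python is not required to write VTML by hand.

```python
def vtml_open_tag(command, prop, value):
    """Assemble a VTML opening tag from its parts: angle bracket,
    language name, underscore, command, property, and value."""
    return f'<vtml_{command} {prop}="{value}">'

print(vtml_open_tag("speed", "value", 150))  # <vtml_speed value="150">
```

The helper simply mirrors the left-to-right assembly shown in the steps above.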

Now that you’ve written an opening tag, let’s look at when a command needs one or two tags.

Opening and closing tags

In the example above, we told our text-to-speech software to read at a faster speed, but how will our text-to-speech software know which words you want to be spoken quickly? That’s why the speed command has two tags, an opening tag and a closing tag.

We just went through what makes up an opening tag. The closing tag is the easy part, as it lets our text-to-speech software know that you want it to stop applying the command. The closing tag is a forward slash followed by the name of the markup language being used and the command.

For speed, the closing tag looks like this:

     </vtml_speed>

Let’s see the speed tag in action. Say we have a sentence where we want the word “slow” to be read slowly. We place the opening tag before the word “slow” and the closing tag after the word “slow.”

     The mouse was <vtml_speed value="50">slow</vtml_speed> compared to the snake.

The speed command has an opening tag and a closing tag so you can select the areas where you want the software to implement the command. If you’re using a command with an opening and closing tag, such as speed, pitch, and volume, don’t forget to include a closing tag or our text-to-speech software won’t execute the command.

How do you know when a command needs an opening and closing tag and when it doesn’t? Ask yourself: do you need to tell our text-to-speech software which words to apply the command to, or does the command stand alone? If you need to tell our text-to-speech software which words to execute the command on, like we did with speed, then it is a command that needs an opening and closing tag. If you’re not sure, you can always check our VTML Manual.
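Wrapping selected text in matching opening and closing tags can be sketched in a few lines of Python. This is a hypothetical helper for illustration, assuming only that the paired commands (speed, pitch, volume) all take a `value` property as shown above.

```python
def vtml_wrap(text, command, value):
    """Enclose text in a matching pair of VTML tags, e.g.
    <vtml_speed value="50">slow</vtml_speed>."""
    return f'<vtml_{command} value="{value}">{text}</vtml_{command}>'

sentence = ("The mouse was " + vtml_wrap("slow", "speed", 50)
            + " compared to the snake.")
print(sentence)
# The mouse was <vtml_speed value="50">slow</vtml_speed> compared to the snake.
```

Because the closing tag is generated together with the opening tag, a helper like this makes it impossible to forget the closing tag that the command needs.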

Standalone tag

Let’s look at a command that is a single tag. For example, the break command tells our text-to-speech software to take a breath, so to speak, and is one tag. The break doesn’t change how a word is read, but determines how long the software waits before continuing to read.

Here’s the break tag.

     <vtml_break level="3"/>

Notice how the forward slash is included at the end of the tag? The forward slash tells our text-to-speech software that the command is complete. You can’t use the break command as an opening and closing tag since it doesn’t make sense for a break to span across words.

VTML tags in VT Editor

If you’re using our VT Editor, this application makes it really easy to use VTML tags. Simply select options in the menu and VT Editor inserts the tags for you.

For instance, go to Edit. Choose Break. Select a value. VT Editor will insert the tag for you in blue.

VTMLtagScreenshot1

For a command that has two tags, like the speed command, you will first have to select the text you want modified. VT Editor will input the opening tag at the beginning of the selection and the closing tag at the end of the selection.

VTMLtagScreenshot2

How easy is that? Now you can modify the prosody of our text-to-speech voices to your liking!

Learn More about NeoSpeech’s Text-to-Speech

Want to learn more about all the ways Text-to-Speech can be used? Visit our Text-to-Speech Areas of Application page. And check out our Text-to-Speech Products page to find the right package for any device or application.

If you’re interested in integrating Text-to-Speech technology into your product, please fill out our short Sales Inquiry form and we’ll get you all the information and tools you need.

Related Articles

Check out NeoSpeech’s Interactive Online Text-to-Speech Demo!

NeoSpeech VT Editor Overview

What is Text-to-Speech and How Does It Work?

Follow us on LinkedIn, Facebook, Google+, and Twitter!

The post What is a VTML tag? appeared first on Text2Speech Blog.

Text-to-Speech for Commercial Use

The advancement of text-to-speech technology has made a big impact in the business world in the past decade. It’s now much easier for companies and products to interact with customers. Thanks to text-to-speech, companies don’t need to staff large call centers to answer customer calls, nor do they have to hire voice actors to give a voice to their products, applications, or media.

Instead, they can just use text-to-speech.

TTScommercialUse

Once known for sounding robotic and unrealistic, text-to-speech today can rival actual human speech. Plus, technical advancements have made it much more feasible to integrate text-to-speech into just about any product.

NeoSpeech makes it even easier with our text-to-speech packages that can be customized to your specific needs. We take the hard work out of developing the perfect text-to-speech engine that fits your specifications, so you can easily integrate our high quality voices into your solution.

And when we say high quality voices, we mean it. NeoSpeech is known for producing the most realistic, natural sounding text-to-speech voices in the world.

For these reasons, NeoSpeech has been chosen by thousands of businesses looking to add a voice to their products and applications.

In this article, we’ll go over everything you need to know about NeoSpeech for Commercial Use, including the applications of our voices, our packages, and how you can get your hands on them.

Applications of NeoSpeech for commercial users
There are dozens of applications that NeoSpeech’s text-to-speech voices are used for. The versatility of our packages enables customers to use our voices just about any way they want.

Here are a few of the more popular commercial uses of text-to-speech technology:

Accessibility
Augmentative & Alternative Communication (AAC) devices and screen readers frequently feature text-to-speech capabilities to give the user the ability to speak or the ability to take in information.

Announcements
Announcement systems, weather alerts, and emergency alert systems are turning to text-to-speech. Text-to-speech is able to instantly convert any message into speech and get the message out with a clear voice. This is much more efficient than waiting for a human to record a message, especially in emergency situations when time is everything.

Audio Publishing
eBooks, online content, and even audio-enabled points of interest such as displays at museums are using text-to-speech to engage readers better than text ever could.

Education
It’s not surprising that eLearning is one of the fastest growing verticals in the text-to-speech industry. The ability to add narration to learning content has been proven to help students learn, while also making education more accessible to those with visual disabilities or literacy difficulties.

Health Care
Electronic Health Records (EHR) systems and speech-enabled medical devices use text-to-speech to improve the efficiency of the health industry.

Telecommunications
This is one of the biggest uses of text-to-speech today. With high quality voices and a smart, interactive system, companies can replace their entire call center staffs with an automated system, thanks in part to text-to-speech.

Transportation
Navigation apps on smart phones and GPS devices are among the most well-known uses of text-to-speech technology. Automated announcements at airports, bus stops, and other travel centers are becoming popular too.

There are many more ways text-to-speech can be used. Whatever product you’re looking to build, you can have confidence that NeoSpeech can provide the voice you need.

NeoSpeech products for commercial users
NeoSpeech’s product packages are designed to fit your specific needs. Our packages come with the complete text-to-speech engines of the voice(s) of your choice. They all also come with SDKs that make the process of integrating the text-to-speech engine into your product or application simple.

Here’s a look at NeoSpeech’s solutions for commercial customers:

VoiceText TTS Engine
The perfect package for anyone looking to build a standalone speech-enabled application.

VoiceText TTS Server
Similar to the Engine package, except this package is for people who want to host their text-to-speech product on a server, which would allow you to serve multiple text-to-speech requests at once.

VoiceText Embedded
This package was developed with embedded operating systems in mind. Embedded devices and smart phone applications generally have minimal storage space, minimal computing power, and very precise specifications. Our embedded package addresses all of these issues to make implementing text-to-speech into an embedded device or app easy.

VoiceText SAPI
SAPI is Microsoft’s Speech-API. With this package, you’ll be able to use our NeoSpeech voices in any other application that is SAPI-compliant.

VT Editor
Unlike the rest of our products, this is not an engine with a toolkit that you can integrate into your product. Instead, VT Editor is a ready to use text-to-speech application! Just type in any amount of text you wish and instantly convert it into speech! VT Editor also lets you save and download your speech files.

It’s also important to note that NeoSpeech does not send out cookie-cutter versions of our packages. Instead, we’ll optimize the package we send you to fit your needs. Everything from your operating system, technical specs, usage, and customers will influence how we can customize the package to better serve you.

Purchasing process for commercial users
Does it sound like NeoSpeech has the solution you need for your commercial product? If so, the first step is to submit a Sales Inquiry form to our team.

Submit a Sales Inquiry

Then, our friendly sales team will get in touch with you to learn more about your product so we can get you the right package and optimize it to fit your needs.

Learn More about NeoSpeech’s Text-to-Speech
Want to learn more about all the ways Text-to-Speech can be used? Visit our Text-to-Speech Areas of Application page. And check out our Text-to-Speech Products page to find the right package for any device or application.

If you’re interested in integrating Text-to-Speech technology into your product, please fill out our short Sales Inquiry form and we’ll get you all the information and tools you need.

Related Articles
Check out NeoSpeech’s Interactive Online Text-to-Speech Demo!

New eBook, “Which Text-to-Speech Product Is Right For You?” Released

What is Text-to-Speech and How Does It Work?

Follow us on LinkedIn, Facebook, Google+, and Twitter!

The post Text-to-Speech for Commercial Use appeared first on Text2Speech Blog.

Top 5 VTML Tags Infographic

You can change the speed, volume, pitch and more of our text-to-speech voices with VTML. If you’re wondering what VTML is, check out “What is a VTML tag?” Right now, we’ll show you how to use our most popular VTML tags: break, pause, pitch, speed, and volume. Feel free to bookmark this page for quick reference on how to use these popular VTML tags.

Top 5 VTML tags infographic

Break

The VTML break tag tells the text-to-speech voice to take a breath, so to speak. The break tag is useful if you want to quickly insert a moment of silence within a sentence or want to have a longer break at the end of a paragraph.

                <vtml_break level="3"/>

The value must be between 0 and 3, where 0 is read continuously and 3 is a sentence break.

Pause

The VTML pause tag is like the break tag in that it is also a moment of silence. The advantage of the pause tag is that you can be precise about the length of the pause.

                <vtml_pause time="250"/>

The value is in milliseconds and must be between 0 and 65535.

The break and pause tags don’t have a closing tag because they are used in spaces not on the words themselves. You can have a pause or break after a word or sentence, but not within a word. If you put a break or pause tag in the middle of a word, it will be read as two separate words. If you want a word to be said slowly, you’ll want to use the speed tag.

Pitch

The VTML pitch tag raises or lowers the pitch of the text-to-speech voice.

The human voice is capable of a range of inflections and tones. While we’re working on emotional text-to-speech voices, for now, we can use pitch to simulate emotion. For example, a high pitch could inflect excitement or nervousness, while a low pitch could be sadness. You can also change the pitch to place emphasis on a specific word or phrase.

                <vtml_pitch value="150">

                Insert your text here.

                </vtml_pitch>

The pitch value must be between 50 and 200, with 50 being the lowest and 200 the highest.

To use the pitch tag, the opening and closing tags have to enclose one word or more. The enclosed words will have the pitch change applied to them. The next two VTML tags, volume and speed, also have opening and closing tags and must enclose at least one word.

Speed

The VTML speed tag lets you control how fast or slow the text-to-speech voice speaks. A slower speaking voice can be easier to understand, while a faster speed can deliver a lot of information in a short amount of time.

                <vtml_speed value="75">

                Insert your text here.

                </vtml_speed>

The value must be between 50 and 400, with 50 being the slowest and 400 the fastest.

Volume

The VTML volume tag controls how loudly the text-to-speech voice speaks. Along with pitch, volume is another way you can add emphasis to words.

                <vtml_volume value="200">

                Insert your text here.

                </vtml_volume>

The value can be between 0 and 500, with 500 being the loudest.

Remember to include the closing tag when using the pitch, speed, and volume tags. If you want to learn more about VTML, check out our VTML manual.
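Since pitch, speed, and volume all follow the same wrapping pattern, one helper can generate any of them and guard the value ranges listed above. This is a sketch only (the helper is hypothetical; the ranges come from this article, not from NeoSpeech's tooling):

```python
# Valid value ranges for the three wrapping VTML tags, per the ranges above.
RANGES = {"pitch": (50, 200), "speed": (50, 400), "volume": (0, 500)}

def vtml_wrap(tag: str, value: int, text: str) -> str:
    """Wrap text in a vtml_pitch/vtml_speed/vtml_volume tag, checking the range."""
    lo, hi = RANGES[tag]
    if not lo <= value <= hi:
        raise ValueError(f"{tag} value must be between {lo} and {hi}")
    return f'<vtml_{tag} value="{value}">{text}</vtml_{tag}>'

print(vtml_wrap("speed", 75, "Insert your text here."))
```

Generating the tags this way also guarantees the closing tag is never forgotten.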

Learn More about NeoSpeech’s Text-to-Speech

Want to learn more about all the ways Text-to-Speech can be used? Visit our Text-to-Speech Areas of Application page. And check out our Text-to-Speech Products page to find the right package for any device or application.

If you’re interested in integrating Text-to-Speech technology into your product, please fill out our short Sales Inquiry form and we’ll get you all the information and tools you need.


Related Articles

What is a VTML tag?

NeoSpeech VT Editor Overview

Text-to-Speech for Commercial Use


The post Top 5 VTML Tags Infographic appeared first on Text2Speech Blog.

Phonetic Transcription Resources for Speech Technology


Whether you’re doing research in speech technology or adding a text-to-speech function in an application, having a consistent method of representing sounds is important. If you’re not a linguist, it can be overwhelming to look at the many different phonetic representation methods and their charts.

We put together this guide to explain five phonetic representation methods including IPA, Extended SAMPA, Worldbet, SAPI Phoneme Representation, and CMUdict. For some of these methods, we’ve included links to phonetic converters and charts. You can use all of these phonetic representation methods with our phoneme VTML tag to fine-tune the pronunciation of our text-to-speech voices.

All of the phonetic representation methods we’ll go over in this article can be used with our English text-to-speech voices. You can use j-tag with our Japanese text-to-speech voices, Pinyin with our Chinese and Taiwanese text-to-speech voices, and Jyutping with our Cantonese text-to-speech voices. Check your VTML manual for language-specific phoneme tag uses.

5 Phoneme Representation Method VTML Tags

1. International Phonetic Alphabet (IPA)

In 1886, the International Phonetic Association was established in Paris. This association created the International Phonetic Alphabet (IPA) with the goal of making a standard system for the phonetic descriptions of languages. Many phonetic transcription methods are based on IPA.

One of the challenges of using IPA in electronic communications is that a computer cannot read the symbols, and the symbols can be rendered differently across devices. Nowadays, to easily use IPA symbols on the computer, we can use an IPA typing tool or an online converter. If you copy the IPA symbols to another text processor, make sure to use a Unicode font, such as Arial, so the IPA symbols display correctly.

To use IPA with the phoneme VTML tag, use the decimal number of the IPA symbol separated by semicolons.

For example, let’s have Julie pronounce “tomato” with a UK accent.

     <vtml_phoneme alphabet="ipa" ph="116;601;712;109;230;116;111;650;">tomato</vtml_phoneme>
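Those decimal numbers are simply the Unicode code points of the IPA characters, so you can generate the ph string mechanically instead of looking each symbol up. A quick Python sketch (a hypothetical helper, not part of the TTS engine):

```python
def ipa_to_decimal(ipa: str) -> str:
    """Convert an IPA string to VTML's semicolon-separated decimal code points."""
    return "".join(f"{ord(ch)};" for ch in ipa)

print(ipa_to_decimal("təˈmætoʊ"))  # 116;601;712;109;230;116;111;650;
```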


2. Extended Speech Assessment Methods Phonetic Alphabet (Extended SAMPA)

With the rise of computers, linguists and speech technology researchers needed a phonetic representation method that computers could understand. While IPA was a standard phonetic system, it was only seen as graphical symbols by computers.

The Speech Assessment Methods Phonetic Alphabet (SAMPA) was created in the late 1980s under the European Strategic Program on Research in Information Technology (ESPRIT). SAMPA is an ASCII machine-readable phonetic alphabet based on IPA. ASCII stands for American Standard Code for Information Interchange, a character encoding standard for electronic communication created in the 1960s. SAMPA focused on European languages such as French, German, and Italian. Extended SAMPA was designed to include all IPA symbols and to cover more languages. You can use an Extended SAMPA and IPA online converter.

To use Extended SAMPA, here’s what the phoneme VTML tag looks like:

     <vtml_phoneme alphabet="x-sampa" ph="t@'meit@U">tomato</vtml_phoneme>

3. Worldbet

Like Extended SAMPA, Worldbet is an ASCII machine-readable phonetic alphabet based on IPA. Worldbet was designed for African, Asian, European, and Indian languages. The goal was to have a unique symbol for each sound: a single phonetic representation system with unique symbols makes it easier to study multiple languages.

When using Worldbet, here’s what the phoneme VTML tag looks like:

     <vtml_phoneme alphabet="x-worldbet" ph="t&'meitoU">tomato</vtml_phoneme>

4. Carnegie Mellon University Pronouncing Dictionary (CMUdict)

Carnegie Mellon University (CMU) is a leader in computer science research and provides many resources for speech technology. One of our founders studied at CMU. The Speech Group at CMU created the CMU Pronouncing Dictionary (CMUdict) for speech recognition research. The CMUdict online translator shows the pronunciation of American English words.

CMUdict gives “T AH0 M EY1 T OW0” as the phonetic representation of tomato, but we can edit it to give James a UK accent.

     <vtml_phoneme alphabet="x-cmu" ph="T AH0 M AE1 T OW0">tomato</vtml_phoneme>


5. SAPI Phoneme Representation

Microsoft’s SAPI Phoneme Representation is designed to be an easy phonetic representation method for application developers to use. SAPI Phoneme Representation is not meant for fine-tuned control of pronunciation or for linguistic study. Microsoft provides a simple chart for American English pronunciations.

Here’s what it looks like with the phoneme VTML tag.

     <vtml_phoneme alphabet="x-sapi" ph="h eh - l ow 1">hello</vtml_phoneme>

More Resources

With all of these phonetic representation methods, how do you choose? Each phonetic representation method has advantages and disadvantages, so it depends on what your project is. Extended SAMPA and Worldbet are the most popular phonetic representation methods for use with speech technology and electronic communication. Extended SAMPA and Worldbet also cover a range of languages. However, if you’re only working with American English, CMUdict would be sufficient. Or if you’re using SAPI text-to-speech voices, keep it simple with the SAPI Phoneme Representation method.

For a full chart comparing all five phonetic representation methods, look at Appendix A of our VTML manual.

If you’re interested in learning more about phonetic representation methods, check out these academic papers.

Speech Assessment Methods Phonetic Alphabet (SAMPA): Analysis of Urdu” by Hasan Kabir and Abdul Mannan Saleem

Computer-coding the IPA: a proposed extension of SAMPA” by J. C. Wells

ASCII Phonetic Symbols for the World’s Languages: Worldbet” by James L. Hieronymus

Learn More about NeoSpeech’s Text-to-Speech

Want to learn more about all the ways Text-to-Speech can be used? Visit our Text-to-Speech Areas of Application page. And check out our Text-to-Speech Products page to find the right package for any device or application.

If you’re interested in integrating Text-to-Speech technology into your product, please fill out our short Sales Inquiry form and we’ll get you all the information and tools you need.

Related Articles

How 4 Influential Linguists Changed History

Text-to-Speech for Commercial Use

Top 5 VTML Tags Infographic


The post Phonetic Transcription Resources for Speech Technology appeared first on Text2Speech Blog.

Fine Tune Text-to-Speech Pronunciation


Whether you’re in the education, healthcare, or transportation industries, it is important for your text-to-speech software to sound natural and pronounce words correctly. A text-to-speech engine may not know how to say industry specific terms or molecular names. For that reason, a great text-to-speech engine must have customizable capability. Our text-to-speech engines provide this capability. You can modify the pronunciation of our text-to-speech voices with VTML tags and the User Dictionary. In this article, we’ll explain how to use our part of speech and phoneme VTML tags.

Fine Tune Text-to-Speech Pronunciation

Part of Speech

There are some words that are spelled the same, but pronounced differently and have different meanings. How is a text-to-speech engine supposed to know when you’re using record as a verb or a noun? Our text-to-speech engines are pretty smart. While our text-to-speech engines won’t understand what your text means (they’re not AIs), they do understand context. That’s why Julie will pronounce “record” correctly according to its use in a full sentence.

Here’s an example sentence: Did you record the guitar on this record?


We didn’t have to use any VTML tags, because Julie already understood that “record” was being used first as a verb and then as a noun.

In a case where our text-to-speech voices don’t have enough context to determine how to pronounce the word, we could use the partofsp VTML tag, which stands for “part of speech.” The part of speech tag is an easy way to change how a word is pronounced based on whether it is a noun, verb, adjective, and so on, without having to look at phonetic symbol charts.

For our example sentence, here’s how we would use the part of speech tag:

     Did you <vtml_partofsp part="verb">record</vtml_partofsp> the guitar on this <vtml_partofsp part="noun">record</vtml_partofsp>?

The part of speech values you can use are: unknown, noun, verb, modifier, function, or interjection.
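Because the part values form a small fixed set, a helper can guard against typos when generating these tags. A minimal sketch (a hypothetical helper built from the values listed above, not part of NeoSpeech's tooling):

```python
# The six part-of-speech values accepted by the partofsp tag, per the list above.
PARTS = {"unknown", "noun", "verb", "modifier", "function", "interjection"}

def vtml_partofsp(part: str, word: str) -> str:
    """Wrap a word in a vtml_partofsp tag, rejecting unknown part values."""
    if part not in PARTS:
        raise ValueError(f"invalid part of speech: {part!r}")
    return f'<vtml_partofsp part="{part}">{word}</vtml_partofsp>'

print(vtml_partofsp("verb", "record"))
```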

What do you do if the two words are the same part of speech? One example is the word “bass.” Since both the musical instrument and the fish are nouns, we can’t use the part of speech tag to determine pronunciation. That’s where our phoneme VTML tag comes in.

Phoneme

The phoneme VTML tag gives you full control over pronunciation. A phoneme is the smallest unit of speech. As long as the phoneme is recorded in the text-to-speech engine’s vocabulary, you can use it.

For the conundrum we ran into with the word “bass,” here’s how we can use the phoneme tag to modify pronunciation.

     He caught a <vtml_phoneme alphabet="x-cmu" ph="B AE1 S">bass</vtml_phoneme>.


There are five phoneme alphabets for our English text-to-speech voices, which are IPA, Extended SAMPA, Worldbet, CMUdict, and SAPI Phoneme Representation. Learn how to use each phoneme alphabet in our “Phonetic Transcription Resources for Speech Technology” guide.

Our phoneme tags are useful in determining the pronunciation of industry specific jargon, chemical names, and medications.

To learn more about all of our VTML tags and how to use them, check out our VTML manual.

Learn More about NeoSpeech’s Text-to-Speech

Want to learn more about all the ways Text-to-Speech can be used? Visit our Text-to-Speech Areas of Application page. And check out our Text-to-Speech Products page to find the right package for any device or application.

If you’re interested in integrating Text-to-Speech technology into your product, please fill out our short Sales Inquiry form and we’ll get you all the information and tools you need.

Related Articles

New Case Study on eLearning Accessibility

Text-to-Speech for Commercial Use

Top 5 VTML Tags Infographic


The post Fine Tune Text-to-Speech Pronunciation appeared first on Text2Speech Blog.

User Dictionary for VT Editor and SAPI


You can customize the pronunciation of our natural sounding text-to-speech voices with VTML tags or the User Dictionary. The User Dictionary allows you to add and modify words to our text-to-speech engine’s vocabulary. We’ll show you how to access the User Dictionary, add an abbreviation, and modify the pronunciation of a word with International Phonetic Alphabet (IPA) symbols.

If you have terms or acronyms that you use frequently, it’s easier and faster to input the pronunciations once in User Dictionary instead of constantly editing the words with VTML tags. However, if you want a word to be pronounced a certain way one time, it’s better to use VTML tags, which you can learn how to use in our “Fine Tune Text-to-Speech Pronunciation” article.

How to access the User Dictionary

VT Editor

Open up VT Editor and click the book icon called UserDic.

Open the User Dictionary in VT Editor

SAPI

If you’re using one of our SAPI voices, you can access the User Dictionary application by going to the Program Files folder. For 32-bit SAPI, go to Program Files (x86). For 64-bit SAPI, go to Program Files. Then go through the folders VW, then VT, then the name of the text-to-speech voice, M and the sampling rate, then lib, and click on UserDicEng.

Open the User Dictionary for SAPI voices

Adding an abbreviation or acronym

In the Source field, type the acronym or abbreviation you want to have read differently. For this example, select case-sensitive because we only want “US” to be read as “United States.” We still want the word “us” to be read normally.

Add a word in the User Dictionary

The Target field is what the text-to-speech voice will read out. In the Target field, make sure Alphabet is selected as it will allow you to type regular letters so we can type in “United States.”

Select case-sensitive and Alphabet in the User Dictionary

Click Read to listen to how the text-to-speech voice will read the abbreviation. If it sounds good, click OK to add the word.

Note: When you add a word in the User Dictionary, it applies to all instances of that word.

Modifying the pronunciation of a word

To change the pronunciation of a word, you’ll have to input IPA symbols.

In the Target area, when you select Pronunciation Symbol, the window expands to show IPA symbols for vowels and consonants.

Select Pronunciation Symbol in the User Dictionary

You can now type these IPA symbols with your keyboard or select the symbols individually to insert them into the Target field. Here’s a chart of the IPA symbols you can type with your keyboard.

User Dictionary IPA Symbol Keyboard

Learn More about NeoSpeech’s Text-to-Speech

Want to learn more about all the ways Text-to-Speech can be used? Visit our Text-to-Speech Areas of Application page. And check out our Text-to-Speech Products page to find the right package for any device or application.

If you’re interested in integrating Text-to-Speech technology into your product, please fill out our short Sales Inquiry form and we’ll get you all the information and tools you need.

Related Articles

Phonetic Transcription Resources for Speech Technology

Text-to-Speech for Commercial Use

Top 5 VTML Tags Infographic


The post User Dictionary for VT Editor and SAPI appeared first on Text2Speech Blog.
