Using PubNub: A Do-It-Yourself Emotional Talking Chat

Surprisingly, there is still very little information about PubNub on the Russian-language Internet (Habr included). Meanwhile, over the past seven years this Californian startup, founded in 2010, has grown into what the company itself calls a Global Data Stream Network (DSN), which in practice is an IaaS solution built for real-time messaging. Our company, Distillery, is currently one of PubNub's four development partners. I mention this not to boast, but to share with the community our experience of using PubNub, based on the demo project we had to build to obtain that status.

Those who cannot wait to see the code (C# + JavaScript) can go straight to the repository on GitHub. Those interested in what PubNub can do and how it works, read on.

In general, PubNub offers three categories of services:

Realtime Messaging. An API that implements the Publish/Subscribe mechanism, backed by a ready-made global infrastructure of 15 locations distributed across the globe with a stated latency of no more than 250 ms. On top of that come extras such as support for high-load channels, data compression, and automatic batching of messages over unstable connections.
Presence. An API for tracking client state, from plain online/offline status to custom things like typing notifications.
Functions. Until recently this service was called BLOCKS; it has just gone through a rebranding (or rather, is still going through it). Functions are JavaScript scripts that run on PubNub servers and let you filter, aggregate, and transform data or, as we will soon see, interact with third-party services.

On top of all this, PubNub offers more than 70 SDKs for a wide variety of programming languages and platforms, including IoT solutions based on Arduino, Raspberry Pi, and even Samsung Smart TV (the full list can be found here).

Enough theory; let's move on to practice. The test assignment that precedes partner status reads as follows: “Create a project based on PubNub using any two SDKs and the following features: Presence, PAM and one BLOCK”. PAM stands for PubNub Access Manager, an add-on security framework that lets you control access at the level of the whole application, an individual channel, or a specific user. Since the assignment is worded rather vaguely, it leaves plenty of room for imagination, which eventually led to a not very useful but rather entertaining idea: a talking chat. And to make it more fun, the chat is not only voiced by a speech synthesizer but can also convey emotion in its speech.

Conceptually the application is very simple: it is a two-page website. The user first lands on the login page, which performs no real authentication, and after entering a nickname and choosing a mode (full or ReadOnly) moves on to the chat page. That page has a “window” with channel messages, including system ones such as “Vasya joined the channel”, a field for typing messages, and a drop-down list of emotions. When new messages arrive from other users, a speech synthesizer reads them aloud with the emotion the author chose when sending. To convert text to speech we use the standard IBM Watson BLOCK, which needs minimal configuration, mostly the choice of voice. At the time of writing, only three voices supported emotional speech: en-US_AllisonVoice (female), en-US_LisaVoice (female) and en-US_MichaelVoice (male). A couple of months earlier only Allison could do it, so, as they say, progress is evident.

Now, on to the code. The server side, and this is the beauty of it, balances somewhere on the edge between simple and primitive:

    public class HomeController : Controller
    {
        public ActionResult Login()
        {
            return View();
        }

        [HttpPost]
        public ActionResult Main(LoginDTO loginDTO)
        {
            String chatChannel = ConfigurationHelper.ChatChannel;
            String textToSpeechChannel = ConfigurationHelper.TextToSpeechChannel;
            String authKey = loginDTO.Username + DateTime.Now.Ticks.ToString();

            var chatManager = new ChatManager();
            if (loginDTO.ReadAccessOnly)
            {
                chatManager.GrantUserReadAccessToChannel(authKey, chatChannel);
            }
            else
            {
                chatManager.GrantUserReadWriteAccessToChannel(authKey, chatChannel);
            }
            chatManager.GrantUserReadWriteAccessToChannel(authKey, textToSpeechChannel);

            var authDTO = new AuthDTO()
            {
                PublishKey = ConfigurationHelper.PubNubPublishKey,
                SubscribeKey = ConfigurationHelper.PubNubSubscribeKey,
                AuthKey = authKey,
                Username = loginDTO.Username,
                ChatChannel = chatChannel,
                TextToSpeechChannel = textToSpeechChannel
            };
            return View(authDTO);
        }
    }

The controller's Main method receives the DTO from the login form, reads the channel names from the configuration (one channel for the chat, a second for communicating with IBM Watson), sets the access level by calling the appropriate ChatManager methods, and passes everything it has collected to the page. From there the front end takes over. For completeness, here is the ChatManager class, which encapsulates the interaction with the PubNub SDK:

    public class ChatManager
    {
        private const String PRESENCE_CHANNEL_SUFFIX = "-pnpres";
        private Pubnub pubnub;

        public ChatManager()
        {
            var pnConfiguration = new PNConfiguration();
            pnConfiguration.PublishKey = ConfigurationHelper.PubNubPublishKey;
            pnConfiguration.SubscribeKey = ConfigurationHelper.PubNubSubscribeKey;
            pnConfiguration.SecretKey = ConfigurationHelper.PubNubSecretKey;
            pnConfiguration.Secure = true;
            pubnub = new Pubnub(pnConfiguration);
        }

        public void ForbidPublicAccessToChannel(String channel)
        {
            pubnub.Grant()
                .Channels(new String[] { channel })
                .Read(false)
                .Write(false)
                .Async(new AccessGrantResult());
        }

        public void GrantUserReadAccessToChannel(String userAuthKey, String channel)
        {
            pubnub.Grant()
                .Channels(new String[] { channel, channel + PRESENCE_CHANNEL_SUFFIX })
                .AuthKeys(new String[] { userAuthKey })
                .Read(true)
                .Write(false)
                .Async(new AccessGrantResult());
        }

        public void GrantUserReadWriteAccessToChannel(String userAuthKey, String channel)
        {
            pubnub.Grant()
                .Channels(new String[] { channel, channel + PRESENCE_CHANNEL_SUFFIX })
                .AuthKeys(new String[] { userAuthKey })
                .Read(true)
                .Write(true)
                .Async(new AccessGrantResult());
        }
    }

The constant PRESENCE_CHANNEL_SUFFIX deserves a closer look. The Presence mechanism delivers its messages over a separate channel which, by convention, takes the name of the current channel with the suffix “-pnpres” appended. Note that PubNub Access Manager, used here through the Grant call, requires you to name the presence channel explicitly when setting access rights.
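To make the convention concrete, here is a minimal JavaScript sketch of what the server-side grants amount to. It is a toy in-memory model, not PubNub's implementation or SDK: the grant and isAllowed helpers and the sample auth keys are hypothetical, and only the “-pnpres” suffix comes from the listing above.

```javascript
// Suffix that turns a channel name into its presence channel (from the C# listing).
const PRESENCE_CHANNEL_SUFFIX = "-pnpres";

// Toy permission table: "authKey|channel" -> { read, write }.
// Hypothetical stand-in for whatever PAM stores on PubNub's side.
const grants = new Map();

// Record a grant for the channel AND its presence channel, mirroring
// how ChatManager passes both names to Grant().
function grant(authKey, channel, read, write) {
  for (const name of [channel, channel + PRESENCE_CHANNEL_SUFFIX]) {
    grants.set(authKey + "|" + name, { read, write });
  }
}

// Check one operation ("read" or "write"); anything not granted is denied.
function isAllowed(authKey, channel, operation) {
  const entry = grants.get(authKey + "|" + channel);
  return entry !== undefined && entry[operation] === true;
}

grant("vasya636", "chat", true, true);   // full mode
grant("petya637", "chat", true, false);  // ReadOnly mode

console.log(isAllowed("petya637", "chat", "write"));        // false
console.log(isAllowed("petya637", "chat-pnpres", "read"));  // true
```

In this model a ReadOnly user can still read the chat and see presence events, but a write attempt is rejected, which is the behavior the two Grant methods are meant to produce.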

    var pubnub;
    var chatChannel;
    var textToSpeechChannel;
    var username;

    function init(publishKey, subscribeKey, authKey, username, chatChannel, textToSpeechChannel) {
        pubnub = new PubNub({
            publishKey: publishKey,
            subscribeKey: subscribeKey,
            authKey: authKey,
            uuid: username
        });
        this.username = username;
        this.chatChannel = chatChannel;
        this.textToSpeechChannel = textToSpeechChannel;
        addListener();
        subscribe();
    }

The first thing the JavaScript code has to do is initialize the corresponding SDK, as the init function above shows. For convenience and simplicity, some values are kept in global variables. After initialization we add a listener for the events we care about and subscribe to the chat, Presence and IBM Watson channels. Let's start with the subscription:

    function subscribe() {
        pubnub.subscribe({
            channels: [chatChannel, textToSpeechChannel],
            withPresence: true
        });
    }

The subscribe code speaks for itself; addListener is a bit more involved:

    function addListener() {
        pubnub.addListener({
            status: function (statusEvent) {
                if (statusEvent.category === "PNConnectedCategory") {
                    getOnlineUsers();
                }
            },
            message: function (message) {
                if (message.channel === chatChannel) {
                    var jsonMessage = JSON.parse(message.message);
                    var chat = document.getElementById("chat");
                    if (chat.value !== "") {
                        chat.value = chat.value + "\n";
                        chat.scrollTop = chat.scrollHeight;
                    }
                    chat.value = chat.value + jsonMessage.Username + ": " + jsonMessage.Message;
                } else if (message.channel === textToSpeechChannel) {
                    if (message.publisher !== username) {
                        var audio = new Audio(message.message.speech);
                        audio.play();
                    }
                }
            },
            presence: function (presenceEvent) {
                if (presenceEvent.channel === chatChannel) {
                    if (presenceEvent.action === 'join') {
                        if (!UserIsOnTheList(presenceEvent.uuid)) {
                            AddUserToList(presenceEvent.uuid);
                        }
                        PutStatusToChat(presenceEvent.uuid, "joins the channel");
                    } else if (presenceEvent.action === 'timeout') {
                        if (UserIsOnTheList(presenceEvent.uuid)) {
                            RemoveUserFromList(presenceEvent.uuid);
                        }
                        PutStatusToChat(presenceEvent.uuid, "was disconnected due to timeout");
                    }
                }
            }
        });
    }

First, we handle the status event with the “PNConnectedCategory” category to catch the moment the current user connects to the channel. This matters because the list of all participants needs to be fetched and displayed only once, whereas the Presence “join” event fires every time a new client joins. Second, when a new message arrives we check which channel it was addressed to and, depending on the result, either build a text line by plain concatenation or create an Audio object from the IBM Watson audio link and start playback.
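The channel check described above can also be pulled out of the DOM code into a pure function, which makes the routing easier to reason about and to test. The sketch below is only an illustration: routeMessage and the action objects it returns are hypothetical names, and the event shape (channel, message, publisher) simply mirrors the fields the listener reads.

```javascript
// Decide what to do with an incoming message event without touching the DOM.
// `ctx` carries the values the article keeps in global variables.
function routeMessage(event, ctx) {
  if (event.channel === ctx.chatChannel) {
    // Chat messages are published as JSON strings, so parse them first.
    const parsed = JSON.parse(event.message);
    return { type: "chatLine", text: parsed.Username + ": " + parsed.Message };
  }
  if (event.channel === ctx.textToSpeechChannel) {
    // Do not read the user's own messages back to them.
    if (event.publisher === ctx.username) {
      return { type: "ignore" };
    }
    return { type: "speak", url: event.message.speech };
  }
  // Anything from an unexpected channel is dropped.
  return { type: "ignore" };
}

const ctx = { chatChannel: "chat", textToSpeechChannel: "tts", username: "vasya" };

console.log(routeMessage(
  { channel: "chat", message: JSON.stringify({ Username: "petya", Message: "hi" }) },
  ctx
)); // { type: 'chatLine', text: 'petya: hi' }
```

The listener would then only need to act on the returned object: append the text to the chat window for a chatLine, or create and play an Audio object for a speak action.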

Another interesting thing happens when sending a message:

    function publish(message) {
        var jsonMessage = {
            "Username": username,
            "Message": message
        };
        var publishConfig = {
            channel: chatChannel,
            message: JSON.stringify(jsonMessage)
        };
        pubnub.publish(publishConfig);

        var emotedText = '<speak>';
        var selectedEmotion = iconSelect.getSelectedValue();
        if (selectedEmotion !== "") {
            emotedText += '<express-as type="' + selectedEmotion + '">';
        }
        emotedText += message;
        if (selectedEmotion !== "") {
            emotedText += '</express-as>';
        }
        emotedText += '</speak>';

        jsonMessage = {
            "text": emotedText
        };
        publishConfig = {
            channel: textToSpeechChannel,
            message: jsonMessage
        };
        pubnub.publish(publishConfig);
    }

First we build the message itself, then wrap it in the configuration object the SDK understands, and only then send it. And there is more: to turn the text into synthesized speech, we publish a second message to the IBM Watson channel. Emotional coloring is specified with Speech Synthesis Markup Language (SSML), specifically the <express-as> tag. As you have probably guessed, a message sent by a ReadOnly user will be blocked by PAM and will never reach its recipients.
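The string concatenation in publish can be wrapped in a small helper. The sketch below also escapes XML special characters, which the original code does not do; that is an addition here, made on the assumption that a message containing “<” or “&” would otherwise yield malformed SSML. The names buildSsml and escapeXml are hypothetical, and "GoodNews" is used only as a sample emotion value assumed to be accepted by the chosen voice.

```javascript
// Escape characters that would otherwise break the SSML markup.
function escapeXml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Wrap a chat message in <speak>, adding <express-as> only when
// an emotion was selected (an empty string means "no emotion").
function buildSsml(message, emotion) {
  const body = escapeXml(message);
  if (!emotion) {
    return "<speak>" + body + "</speak>";
  }
  return '<speak><express-as type="' + escapeXml(emotion) + '">' +
    body + "</express-as></speak>";
}

console.log(buildSsml("We won!", "GoodNews"));
// <speak><express-as type="GoodNews">We won!</express-as></speak>
```

With such a helper, the second half of publish reduces to building the text with buildSsml(message, selectedEmotion) and publishing it to the text-to-speech channel.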

Products already on the market that build on PubNub include, for example, Insteon's smart home concept and Curago's mobile app for planning family events. In conclusion, a reminder that the full code of the example is available on GitHub.

Source: https://habr.com/ru/post/338720/
