Make the conversation in a Slack call visible in Slack.
This article is also available in Japanese here.
Over the past few years, many companies have introduced remote work. I believe these companies use Slack and various other communication tools to streamline their operations. However, many workplaces still seem to suffer from a lack of conversation and casual chat, and this causes various problems.
More than 70% report poor mental health due to a lack of conversation and small talk | MANA-Biz
On the other hand, there seems to be a certain number of people who don’t like the office environment where chatting occurs spontaneously at any time. When you want to concentrate on your work, it can be annoying to hear people chatting about something interesting nearby.
However, I would still like to be a part of chats where people are having fun or talking about things that interest me. Perhaps there are times when the level of “interesting content” exceeds the level of “wanting to concentrate on work,” and being able to consciously participate in those times is what makes for a stress-free environment.
In my office, we have a Slack channel for chatting, and we often use Slack calls to chat. When I was working in the office, I could hear what was going on and join in if it sounded interesting. With Slack calls, however, you don’t know what is being discussed, so reserved people like me hesitate to join. If the conversation in a Slack call were visible, these people might have more opportunities to join the chat. Also, in an office it is difficult to completely shut out chatter unless you wear earplugs. If the content of Slack calls is visible in Slack, we can concentrate on our work without being distracted, because we only see the chatter when we intentionally open the chat channel.
That’s why I’ve created a Slack App called “speech-to-text-chat” that displays the contents of a Slack call in the channel.
In practice, the content of a Slack call will be posted as text, as shown below.
The figure below shows a rough configuration of speech-to-text-chat.
There are four major components that make up speech-to-text-chat.
(1) A browser tab that displays the Slack workspace and channel (Slack tab).
(2) A browser tab for the Slack call launched from Slack (Slack call tab).
(3) A browser tab that transcribes microphone input (transcribe tab).
(4) A Slack App that receives Slack events and the text from (3), and posts messages to the Slack channel (speech-to-text-chat server).
The arrows in the figure indicate the flow of data, but the flow of data to the Slack tab is actually the flow of data to the back-end Slack server. The Slack server has been omitted to simplify the figure.
The newly created components are (3) transcribe tab and (4) speech-to-text-chat server.
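As a rough illustration of what component (4) does with each piece of text it receives, the sketch below builds a request for Slack's `chat.postMessage` Web API method. The function name, the "speaker: text" message format, and the way the bot token is supplied are assumptions made for illustration; the actual server in the repository may well use a Slack SDK instead of raw HTTP.

```typescript
// Hypothetical description of an HTTP request to the Slack Web API.
interface SlackRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build a chat.postMessage request that shows who said what.
// `botToken` is the bot token of the installed Slack App (assumed to be available).
function buildTranscriptMessage(
  botToken: string,
  channelId: string,
  speaker: string,
  transcript: string
): SlackRequest {
  return {
    url: "https://slack.com/api/chat.postMessage",
    method: "POST",
    headers: {
      "Content-Type": "application/json; charset=utf-8",
      Authorization: `Bearer ${botToken}`,
    },
    // The message format below is an assumption, not the app's actual format.
    body: JSON.stringify({ channel: channelId, text: `${speaker}: ${transcript}` }),
  };
}

// Usage (needs a runtime that provides fetch):
// const req = buildTranscriptMessage(token, channelId, "alice", "hello");
// await fetch(req.url, { method: req.method, headers: req.headers, body: req.body });
```

Keeping the request construction separate from the network call makes this part easy to check without a Slack workspace.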
The behavior in more detail is as follows. Again, the Slack tab here stands for the Slack server.
When a user initiates a Slack call, another tab is opened in the browser and the Slack call is started.
This event triggers the speech-to-text-chat server, which displays a button (Join button) in Slack to start transcribing the conversation.
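A message with a button like this can be expressed with Slack's Block Kit. The following is only a sketch of what such a payload might look like: the wording, the `action_id` value, and the `callContext` string are hypothetical and not taken from the app.

```typescript
// Build a message payload containing a single Join button.
// `callContext` is a hypothetical string that lets the click handler
// know which call the button belongs to.
function buildJoinMessage(channelId: string, callContext: string) {
  const prompt = "A call has started. Click Join to post its transcript here.";
  return {
    channel: channelId,
    text: prompt, // fallback text for notifications
    blocks: [
      {
        type: "section",
        text: { type: "mrkdwn", text: prompt },
      },
      {
        type: "actions",
        elements: [
          {
            type: "button",
            text: { type: "plain_text", text: "Join" },
            action_id: "join_transcription", // hypothetical identifier
            value: callContext,
          },
        ],
      },
    ],
  };
}
```

When the button is clicked, Slack sends an interaction payload that includes the `action_id` and `value`, which is how a server can tell which button was pressed.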
When a user clicks the Join button, the speech-to-text-chat server creates a URL that encodes the user information and other data contained in the event, and displays it in Slack. When the user opens this URL, the transcribe tab opens.
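To make the idea of "a URL encoded with the user information" concrete, here is a minimal sketch. The field names, the `session` query parameter, and the plain JSON encoding are all assumptions; the real app may encode, sign, or encrypt this data differently.

```typescript
// Hypothetical shape of the information carried in the URL.
interface TranscribeSession {
  teamId: string;
  channelId: string;
  userId: string;
  userName: string;
}

// Encode the session as a single query parameter (assumed name: "session").
function buildTranscribeUrl(baseUrl: string, session: TranscribeSession): string {
  const payload = encodeURIComponent(JSON.stringify(session));
  return `${baseUrl}?session=${payload}`;
}

// Recover the session on the transcribe tab side.
function parseTranscribeUrl(url: string): TranscribeSession {
  const match = /[?&]session=([^&#]+)/.exec(url);
  if (!match) {
    throw new Error("URL does not contain a session parameter");
  }
  return JSON.parse(decodeURIComponent(match[1])) as TranscribeSession;
}
```

Because anyone who holds such a URL could post as that user, a production version would presumably protect the payload (for example with a signature or an expiry time) rather than send plain JSON as this sketch does.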
From this point on, the user’s voice is sent to both the Slack call tab and the transcribe tab. The behavior of the Slack call tab is unchanged. The transcribe tab converts the voice into text using the browser’s built-in speech recognition API. This text is sent to the speech-to-text-chat server, which posts it so that it appears in the Slack tab.
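The browser's built-in speech recognition API is presumably the Web Speech API (`SpeechRecognition`, exposed as `webkitSpeechRecognition` in Chromium-based browsers). The sketch below shows one way the transcribe tab could forward finalized phrases. The interfaces are a hand-written subset of the real API so that the function can be exercised outside a browser, and the function name and the restart-on-end behavior are my assumptions rather than the app's actual code.

```typescript
// Minimal structural types for the parts of the Web Speech API used here.
interface RecognitionAlternative { transcript: string; }
interface RecognitionResult { isFinal: boolean; 0: RecognitionAlternative; }
interface RecognitionEvent {
  resultIndex: number;
  results: ArrayLike<RecognitionResult>;
}
interface Recognition {
  continuous: boolean;
  interimResults: boolean;
  lang: string;
  onresult: ((event: RecognitionEvent) => void) | null;
  onend: (() => void) | null;
  start(): void;
}

// Configure a recognizer and pass each finalized phrase to `onText`.
function startTranscription(
  recognition: Recognition,
  lang: string,
  onText: (text: string) => void
): void {
  recognition.continuous = true;      // keep listening across phrases
  recognition.interimResults = false; // only report finalized results
  recognition.lang = lang;
  recognition.onresult = (event) => {
    for (let i = event.resultIndex; i < event.results.length; i++) {
      const result = event.results[i];
      if (result.isFinal) {
        const text = result[0].transcript.trim();
        if (text.length > 0) onText(text);
      }
    }
  };
  // Assumption: restart whenever the browser ends the session by itself.
  recognition.onend = () => recognition.start();
  recognition.start();
}

// In a browser (the "/transcript" endpoint is hypothetical):
// const Ctor = (window as any).SpeechRecognition ?? (window as any).webkitSpeechRecognition;
// startTranscription(new Ctor(), "en-US", (text) =>
//   fetch("/transcript", { method: "POST", body: JSON.stringify({ text }) }));
```

Reporting only final results keeps the channel from filling up with partial guesses, at the cost of a short delay before each phrase appears.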
The above is the general structure and operation.
For the detailed implementation, please refer to the code in the repository described below.
Here, I will explain how to use the speech-to-text-chat server that I have already deployed to Heroku.
The content of the channel will be sent to the speech-to-text-chat server that I have set up, so please refrain from using it in channels where confidential information is discussed. For such channels, please deploy your own instance instead; the readme of the repository described below shows how to deploy to Heroku yourself.
Also, voice recognition is not 100% accurate. In some cases it may fail to recognize words properly, or it may convert them into words that you find offensive. Please be forewarned.
Click on the link below. You will see the Slack App installation screen as shown below.
Check the scopes to be used and click “Allow” if there are no problems. When the following screen appears, the installation is complete.
You can also select the workspace to be installed from the pull-down menu in the upper right corner.
Next, log in to Slack and invite the app to the channel where you want to use speech-to-text-chat by mentioning `@speech-to-text-chat`.
From now on, when you start a Slack call in a channel where you have added speech-to-text-chat, the following message will appear and you can click the Join button.
When the following message appears, click on openTab.
When a new tab opens and the word Listening appears on the screen, you are ready to go.
Try saying something, and Slack will show what you said.
As you can see, speech-to-text-chat works independently of Slack calls. This means that you can use speech-to-text-chat without using Slack calls. You can start it with the slash command `/speech-to-text-chat <room-name>`. This also allows you to use it with another video conferencing tool such as Zoom or Teams.
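For reference, Slack delivers a slash command to the app with the command name in a `command` field and everything typed after it in a `text` field. The sketch below shows how a handler might pull `<room-name>` out of that payload. The function name and the rule of taking the first word are assumptions; how the real app interprets the room name is described in the repository, not here.

```typescript
// Subset of the fields Slack sends with a slash command.
interface SlashCommandPayload {
  command: string;    // e.g. "/speech-to-text-chat"
  text: string;       // everything the user typed after the command
  user_id: string;
  channel_id: string;
}

// Return the room name, or null if the command is not ours or no name was given.
function parseRoomName(payload: SlashCommandPayload): string | null {
  if (payload.command !== "/speech-to-text-chat") return null;
  const room = payload.text.trim().split(/\s+/)[0];
  return room ? room : null;
}
```

Returning `null` for an empty argument lets the handler reply with a short usage hint instead of starting a session with no name.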
The source code for speech-to-text-chat described in this blog is available in this repository.
If you want to deploy it in your environment, please refer to the readme here.
GitHub - w-okada/speech-to-text-chat
Slack App to transcribe a call and post it to slack channel. The content of the channel will be sent to a…
In this article, I introduced an application that converts conversations in Slack calls into text and posts them to Slack.
The original purpose of this app was to draw in users who have not yet joined a Slack call. In addition to simply converting conversations into text, I’m thinking of adding a feature that summarizes the conversation in a way that catches users’ interest. If you have any good ideas, I would be grateful for your comments.
In no event shall we be liable for any direct, indirect, consequential, or special damages arising out of the use or inability to use the software on this blog.