open source – Sean Novak
https://snovak.com
build in public
Last updated: Tue, 03 Jan 2023 14:29:45 +0000

ChatGPT – Scaffolding a Nextcloud Plugin
https://snovak.com/2023/01/chatgpt-scaffolding-nextcloud-plugin/
Tue, 03 Jan 2023 13:00:00 +0000
🤯

I’m continually impressed by ChatGPT. This morning I thought it would be really nice to be able to track my health statistics on Nextcloud, my private cloud that I have running just behind me in my closet. What a cool little project to give to ChatGPT and see how quickly we can get something up and running. It’s 8am on a Tuesday morning and I’m back to work at my day job, but I have about an hour to fiddle around with it. Let’s see how quickly ChatGPT can get this started…

A little background: I’ve been tracking some health parameters for a while with iHealth, mainly because they’ve made it easy to do so. I have a Bluetooth blood pressure cuff, and every time I take my BP, it’s logged to the cloud. It has a nice UI. But I’m not very happy with giving my health information away anymore, so I’ve been looking for a new home for my health data. Lately, I’ve been using “Waistline”, an open source app found on F-Droid. It works, but not nearly as nicely as iHealth. The data is siloed, and I’m not really sure how to get it out of the app. So, passively, I’m still looking. That’s where we pick up the story for this idea.

Here is the chat in its entirety. I basically walk the bot through the process of coding the entire plugin for me.

At this point, I have yet to test it out, but as you can see, it’s an amazing start. I’ve got a plugin templated out, an API, and directions to get the frontend started as well. It’s 9am now, so I need to get to my day job. But, wow. Just WOW.

More to come as time permits.

Vosk on-device Speech-to-text
https://snovak.com/2022/12/vosk-on-device-speech-to-text/
Wed, 28 Dec 2022 13:37:55 +0000

Since I’ve started using GrapheneOS, a deGoogled Android build, I’ve missed several services you typically get from Apple or Google on my device. One of those core services is speech-to-text, which helps a lot to speed up note taking, writing text messages, etc.

I’ve been using a very crude Vosk keyboard on Android to fill the gap. I’d love to try to improve upon this project, but for today I’m interested in getting this functionality in Gnome, my Desktop of choice on Ubuntu Linux. This is not meant to be a tutorial, but more of a journal entry.

Documentation for Gnome extensions is scant. Here is what I could find:

Here is a great playlist on YouTube to get more familiar with creating Gnome extensions.

Damn, don’t you hate when you don’t save your work? I just lost a bunch of work. DOH!

Let’s see, I was astonished to see that, in general, the Gnome extensions area is not super active.

Development is a little rough. I have to switch from Wayland to X11, which makes reloading extensions a little easier. In Wayland, you have to log out and back in for extensions to refresh. Yikes.

Here’s a directory of existing extensions: https://extensions.gnome.org/

I like to learn from other code. So I installed this extension, which allows you to manage your system clipboard: https://github.com/Tudmotu/gnome-shell-extension-clipboard-indicator

I haven’t found anything preinstalled to manage extensions. Seems like something that would be readily available in “Settings”. 😮‍💨

Anyway, I started this at 9am and I hope to have something working by noon, but time is dwindling. I just spent some time creating an icon in Figma. No matter what I do, it’s still hard to see the “TXT” in the icon. I may ditch it and just use the mic, but I’ll leave it in for now. I hope Gnome supports SVG, which might render a little nicer. Let’s move on. We have some functionality to create.

Golly, documentation is THIN for Gnome extensions.

I’m simply trying to get a button in the tray that changes color when clicked. Also, reloading extensions is still a CHORE. I have to log out, then log back into Gnome each time. Tedious.

I found a solution to that here: https://www.reddit.com/r/gnome/comments/eb4pn9/how_do_i_reload_a_gnome_shell_extension_during/

I’m using a reload.sh script to start a nested Gnome session, which naturally reloads all the extensions:

dbus-run-session -- gnome-shell --nested --wayland

My SVG isn’t looking great in there though. I may have to use a ready-made system icon.

As you can see, the icon is squished, and also doesn’t change color when clicked.

I’ve got the icon working now, but there is still a styling issue: the icon seems a little small.

I’ve messed with getting Vosk working appropriately. I’ve tried a few of the suggested methods, but I’m having a lot of issues making my microphone accessible in Node.js with the ‘mic’ library.

I’m currently leaning towards running Vosk as a Docker service with the following docker-compose.yml:

version: '3'

services:
  vosk:
    image: alphacep/kaldi-en
    ports:
      - "2700:2700"
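
Before pointing a client at the container, it can help to confirm that the published port is actually accepting connections. The following is a minimal sketch using only the Python standard library; the function name wait_for_port and the timeout values are hypothetical choices, not part of Vosk or Docker.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            # Nothing listening yet (or the connection failed); retry shortly.
            time.sleep(0.5)
    return False
```

Calling wait_for_port("localhost", 2700) after bringing the service up would return True once the port from the compose file is reachable. It only checks the TCP port, not whether the speech model has finished loading.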

So far, only one test script that I’ve tried has actually worked:

#!/usr/bin/env python3

import asyncio
import websockets
import sys
import wave

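async def run_test(uri):
    async with websockets.connect(uri) as websocket:
        # The .wav file to transcribe is passed as the first command-line argument.
        with wave.open(sys.argv[1], "rb") as wf:
            # Tell the server the sample rate before sending any audio.
            await websocket.send('{ "config" : { "sample_rate" : %d } }' % (wf.getframerate()))
            buffer_size = int(wf.getframerate() * 0.2)  # 0.2 seconds of audio
            while True:
                data = wf.readframes(buffer_size)

                if len(data) == 0:
                    break

                # Send one chunk and print the server's reply for it.
                await websocket.send(data)
                print(await websocket.recv())

        # Signal end of stream and print the final result.
        await websocket.send('{"eof" : 1}')
        print(await websocket.recv())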

asyncio.run(run_test('ws://localhost:2700'))

The problem here is that it’s sending a .wav file, not opening the microphone and transcribing the output.
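
One possible way to close that gap, sketched here and not tested against this setup, is to capture raw audio from the microphone and push it over the same websocket. This assumes the third-party sounddevice library, which is not used anywhere above. The names config_message, extract_text, and stream_microphone are hypothetical. The 16 kHz sample rate is a guess that a given microphone may not support, and the reply format (a "partial" key mid-phrase, a "text" key once a phrase is final) is an assumption about what the server sends back.

```python
import asyncio
import json


def config_message(sample_rate: int) -> str:
    """Build the same config message the file-based script sends first."""
    return json.dumps({"config": {"sample_rate": sample_rate}})


def extract_text(message: str) -> str:
    """Pull the transcript out of a server reply.

    Assumption: replies are JSON objects carrying either a "text" key
    (final result) or a "partial" key (phrase still in progress).
    """
    reply = json.loads(message)
    return reply.get("text") or reply.get("partial") or ""


async def stream_microphone(uri: str, sample_rate: int = 16000) -> None:
    # Third-party imports live here so the helpers above work without them.
    import sounddevice as sd
    import websockets

    loop = asyncio.get_running_loop()
    audio_queue = asyncio.Queue()

    def on_audio(indata, frames, time, status):
        # Runs on the audio thread; hand the raw bytes to the event loop.
        loop.call_soon_threadsafe(audio_queue.put_nowait, bytes(indata))

    # 16-bit mono PCM, the same kind of data the .wav script reads from disk.
    with sd.RawInputStream(samplerate=sample_rate, blocksize=4000,
                           dtype="int16", channels=1, callback=on_audio):
        async with websockets.connect(uri) as websocket:
            await websocket.send(config_message(sample_rate))
            while True:  # runs until interrupted, e.g. with Ctrl+C
                await websocket.send(await audio_queue.get())
                text = extract_text(await websocket.recv())
                if text:
                    print(text)
```

With the Docker service up, running asyncio.run(stream_microphone("ws://localhost:2700")) should print text as it is recognized. The queue is there because the audio callback runs on a separate thread and cannot await the websocket directly.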

That’s enough for today. I’ll pick this project back up at some point.
