Showing posts with label Programming. Show all posts

Tuesday, March 25, 2025

Why the Model Context Protocol (MCP) Is Confusing

**Update 3/26/2025**

OpenAI added MCP support to their agent SDK.


--------------------------------

On Nov. 25, 2024, Anthropic published a blog post about the Model Context Protocol (MCP).  It didn't get much attention at the time because it wasn't an announcement of much significance.  Suddenly in 2025, MCP has gotten a lot of hype and attention, and has also caused a lot of confusion as to why there are so many YouTube talking heads discussing it.

It was nice that Anthropic published how they connect Claude with tools in the Claude Desktop app, even if the post was a bit of marketing to sell it as a standard and to encourage an open community.  There is a technical aspect (a protocol) to it, but it felt like a business play to get developers to extend Claude with plugins.

Large Language Models like Claude cannot perform any actions.  They're like a brain with no body.  They might know that an email should be sent, but they can't actually send the email.  Connecting the "thought" (send email) with the action requires basic programming.  MCP is how Anthropic does it, but it isn't the only way.  Let's take a look at various ways this is accomplished and then see how MCP fits in.

You Are the Agent

In this scenario, a person goes to claude.ai and has a conversation with Claude about writing an email to invite someone to lunch.  Claude generates the email body and lets the person know to copy it into an email program to send.  The person manually copies that text into their email or calendar app and sends the invitation.  The person is the agent because they are performing the action.

Using an AI Assistant

Here, a person uses an app (this can be a web app, desktop app, mobile app, etc.) such as a personal assistant à la Jarvis from Iron Man.  The user asks Jarvis to send an email to invite someone for lunch.  Jarvis composes the invitation message and sends the invitation through a calendar app so that it also records the event on the calendar.  So how does Jarvis do this?

Method 1

  1. Jarvis sends your question to the LLM along with prompts describing the available tool(s).
  2. Jarvis looks at the response to determine which tool(s) to use.
  3. Jarvis executes the chosen tool(s) through the tool API.
  4. The results are sent back to the LLM.
  5. The LLM formulates a natural language response.
  6. The response is displayed to you!

In the early days (2023), this might be done like this:

The user speaks to Jarvis, “Jarvis, invite Thor to lunch next Wednesday.”

The Jarvis code passes the text to an LLM along with an additional prompt:

Respond to the user, but if the request is to add to a calendar then respond in JSON:

  {
    "tool": "calendar",
    "invitee": name,
    "date": date,
    "time": time,
    "body": message
  }

The Jarvis program gets the response; if it is a tool response, it parses the JSON and calls the calendar API.

The Jarvis code calls the LLM again with the text, "tell the user that the invite was successfully sent," and then returns the response to the user.
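The Method 1 flow above can be sketched in Go.  The JSON shape and the `handleResponse` function are hypothetical stand-ins for whatever prompt format and LLM client the app actually uses:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// toolCall mirrors the JSON shape the prompt asks the LLM to emit
// when it detects a calendar request.
type toolCall struct {
	Tool    string `json:"tool"`
	Invitee string `json:"invitee"`
	Date    string `json:"date"`
	Time    string `json:"time"`
	Body    string `json:"body"`
}

// handleResponse decides whether the LLM's reply is a tool call or
// plain text to show the user.
func handleResponse(reply string) string {
	var call toolCall
	if err := json.Unmarshal([]byte(reply), &call); err == nil && call.Tool == "calendar" {
		// In a real app, this is where Jarvis would call the calendar API.
		return fmt.Sprintf("invite sent to %s on %s", call.Invitee, call.Date)
	}
	return reply // not a tool call; display the reply as-is
}

func main() {
	reply := `{"tool":"calendar","invitee":"Thor","date":"next Wednesday","time":"12:00","body":"Lunch?"}`
	fmt.Println(handleResponse(reply))
}
```

The key point is that the dispatch logic lives entirely in the Jarvis code; the LLM only emits text that happens to be parseable.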

Method 2

  1. Developer registers available tools.
  2. Jarvis sends your question to the LLM.
  3. The LLM analyzes the available tools and decides which one(s) to use.
  4. Jarvis executes the chosen tool(s) through the tool API.
  5. The results are sent back to the LLM.
  6. The LLM formulates a natural language response.
  7. The response is displayed to you!

An enhancement was added to many LLMs' APIs to allow developers to register tools, their purpose, and their parameters.  The Jarvis code registers a tool called "calendar", gives it a description such as "Tool to add, update and remove user's calendar.", and specifies what parameters it needs.

Now, when Jarvis passes “Jarvis, invite Thor to lunch next week,” to the LLM, it will respond with JSON and Jarvis can call the calendar API.

The Jarvis code calls the LLM again with the text, "tell the user that the invite was successfully sent," and then returns the response to the user.

Method 3 (MCP)

  1. User registers available tools.
  2. Jarvis sends your question to Claude.
  3. Claude analyzes the available tools and decides which one(s) to use.
  4. Jarvis executes the chosen tool(s) through the MCP server, which calls the tool API.
  5. The results are sent back to Claude.
  6. Claude formulates a natural language response.
  7. The response is displayed to you!

With MCP, the user (on desktop/mobile) or developer (on cloud) registers MCP servers with Jarvis.  Jarvis can then get the tool descriptions from the MCP server, which it passes to the LLM.
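In Claude Desktop, for example, that registration is a JSON config file (claude_desktop_config.json).  The command and path below are placeholders for an actual MCP server:

```json
{
  "mcpServers": {
    "calendar": {
      "command": "node",
      "args": ["/path/to/calendar-mcp-server.js"]
    }
  }
}
```

The host app launches each listed server and asks it for its tool descriptions, which is what gets forwarded to the model.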

When Jarvis passes “Jarvis, invite Thor to lunch next week,” to the LLM, the LLM will determine the tool to use.

Jarvis will then call the MCP server to send the calendar invite.

The Jarvis code calls the LLM again with the text, "tell the user that the invite was successfully sent," and then returns the response to the user.

Comparison

With MCP, tool registration is shifted to the user and the tool description is handed off to the tool developer, but otherwise the steps remain the same.

| Method 2 | Method 3 (MCP) |
| --- | --- |
| 1. Developer registers available tools | 1. User registers available tools |
| 2. Jarvis sends your question to the LLM | 2. Jarvis sends your question to Claude |
| 3. The LLM analyzes the available tools and decides which one(s) to use | 3. Claude analyzes the available tools and decides which one(s) to use |
| 4. Jarvis executes the chosen tool(s) through the tool API | 4. Jarvis executes the chosen tool(s) through the MCP server, which calls the tool API |
| 5. The results are sent back to the LLM | 5. The results are sent back to Claude |
| 6. The LLM formulates a natural language response | 6. Claude formulates a natural language response |
| 7. The response is displayed to you! | 7. The response is displayed to you! |

Comparing the different methods shows that the steps are the same; only the implementation differs.  This is one reason there's a lot of confusion: there seems to be very little benefit.

Having a standard protocol can be advantageous, but only if all the LLM providers adopt it; otherwise it is just the way to interact with Claude.


MCP servers are potentially reusable and might ease integration, which is a benefit since it'll be like having only one API to learn.  This requires wide adoption and availability, which isn't a given even if it is backed by one of the big LLM providers.

Shortcomings of the MCP

As a protocol, it has a lot of shortcomings and the technical benefits are minor.

Some technical shortcomings are:

  • There's no discovery mechanism other than manual registration of MCP servers.
  • There are now extra MCP servers in the tech stack doing work that could be achieved with a library.

Thus the main benefits will only come if the protocol is adopted as a standard.

Saturday, February 15, 2025

WebAssembly (WASM) with Go (Golang) Basic Example

I first wrote about using Go for WebAssembly (WASM) 6 years ago, right before the release of Go 1.11, which was the first version of Go that supported compiling to WASM.  Go's initial support for WASM had many limitations (some of which I listed in my initial article) that have since been addressed, so I decided to revisit the topic with some updated examples of using Go for WASM.

Being able to compile code to WASM now allows:

  • Go programs to run in the browser. 
  • Go functions to be called by JavaScript in the browser.
  • Go code to call JavaScript functions through syscall/js.
  • Go code access to the DOM.

Setup

Go's official Wiki now has an article on the basics of using Go for WASM including how to set the compile target and setup.  

A quick summary of the steps:

Compile to WASM with the output filename ending in wasm, since the web server's MIME type configuration (e.g. /etc/mime.types) likely keys off the wasm extension.

> GOOS=js GOARCH=wasm go build -o <filename>.wasm

Copy the JavaScript support file to your working directory (wherever your HTTP server will serve it from).  It's necessary to use the wasm_exec.js matching the version of Go being used, so consider making this part of the build script.

> cp "$(go env GOROOT)/lib/wasm/wasm_exec.js" .

Then add the following to the html file to load the WASM binary: 

<script src="wasm_exec.js"></script>
<script>
    const go = new Go();
    WebAssembly.instantiateStreaming(fetch("main.wasm"), go.importObject).then((result) => {
                go.run(result.instance);
    });
</script>
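A minimal main.go to verify the toolchain works might just print to the console (when compiled with GOOS=js GOARCH=wasm, writes to stdout are routed to the browser's JavaScript console by wasm_exec.js):

```go
package main

import "fmt"

// greeting is what we expect to see in the browser console
// once the WASM binary is loaded.
const greeting = "hello from Go WASM"

func main() {
	// In a WASM build this appears in the browser's console;
	// in a normal build it goes to stdout.
	fmt.Println(greeting)
}
```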

Keeping It Running

It is a good starting point, but the Go code example is too simplistic.  It only demonstrates that the WASM binary created by Go can be loaded, by having it write a line to the browser's console.  The Go program gets loaded by the browser, prints a line, and exits.  Most of the time, it's probably desirable to have the WASM binary stay running after it is loaded.  This can be achieved by having a channel that waits forever:

c := make(chan struct{}, 0)
<- c

or even easier:

select {}

Having either of these at the end of main() will keep the program alive after it gets loaded into the browser.

In Place of JavaScript

Being able to access the DOM excites me the most because it allows me to avoid writing JavaScript, followed by being able to run Go programs in the browser.  While I think the interop between Go and JavaScript is probably the most practical application, it's not something I've had to do much, since I'm not a front-end developer doing optimizations or trying to reuse Go code between the front-end and back-end.

I don't mind using HTML for UI development, or even CSS; I'm just personally not a fan of JavaScript.  This isn't to say that it is bad, just that I prefer other languages, much like some people prefer C++, Java, Python, etc.  I don't have fun writing JavaScript like I do with Go, even though I know JavaScript.

Take a basic example of a web app (index.html) with a button to illustrate:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Example</title>
</head>
<body>

    <button id="myButton">Click Me</button>

</body>
</html>

JavaScript is used to attach an event to it so that when the button is clicked, an alert message pops up:

// Select the button element
const button = document.getElementById('myButton');

// Attach an event listener to the button
button.addEventListener('click', function() {
    alert('Button clicked!');
});

With WASM, the JavaScript can be replaced with Go code:

package main

import (
    "honnef.co/go/js/dom/v2"
)

func main() {
    document := dom.GetWindow().Document()

    // Select the button element
    button := document.GetElementByID("myButton")

    // Attach an event listener to the button
    button.AddEventListener("click", false, func(dom.Event) {
        dom.GetWindow().Alert("Button clicked!")
    })

    select {}
}

In this case, the Go code looks extremely similar to the JavaScript code because I'm using the honnef.co/go/js/dom/v2 package.  It is a Go binding to the JavaScript DOM APIs and I find that it makes it more convenient than using syscall/js directly.

Why do I prefer this over writing JavaScript, especially when they look so similar?  The main reason is that most code is not just calling an API.  There's other logic to implement, and for that I can use Go and Go's libraries, along with the benefits of a compiled, type-safe language.

There are still things that need to be considered before using Go and WASM for a general consumer production web app.  The binary size can be large, so consider your audience; but for my own hobby projects, for corporate apps where I know the users have fast connections, or when app performance and functionality outweigh the initial download and memory usage, I'd try to use it.

Monday, February 10, 2025

The Death of Software Engineering Is Premature

There is a lot to be excited about when it comes to advancements in language models.  The people who benefit the most are software engineers, because these models can enhance the productivity of knowledgeable engineers.

While a simple prompt such as "build me a web app with a conversation interface," will result in a functional web app (this is great because building a basic commodity application is now accessible to everyone),  these aren't the type of applications software engineers are normally tasked to write.  Software engineers are tasked with building multi-layered and evolving software that can't be fully described with a few prompts.  To build the software that businesses need requires skilled and knowledgeable people to direct the engineering work of the AI.

I built a simple app to filter news headlines that is small enough to be digestible in a post that I think shows how far a layman can go with building software using LLMs and why software engineers are still needed.  

The First Prompt

Let's start with what many executives might think makes software engineers unnecessary: 

Build me an app that will filter out political articles from Google News.

ChatGPT understood enough to generate an app that uses keywords to filter the headlines.  One time it created a React app, and a second time it used Python.  Both times it used newsapi.org to get the news feeds and required me to understand how to build and run the app.  The main issue is that the app isn't really what I wanted.  It provides a news search whose results are then matched against keywords to decide what to filter out.  I wanted the news that I normally see when I visit news.google.com, minus the political articles, so I told ChatGPT precisely that:

I don't want to search for news.  I want to see the articles that I see when visiting news.google.com minus political articles

The first time I asked, ChatGPT understood this well enough to switch to using the Google News RSS feed, which is excellent!  The second time, it concluded that the way to do it is to scrape news.google.com.  Both of these prompts highlight that some specialized knowledge is needed.  Does the programming language matter?  Should it use RSS or web scraping?  How do you run the code it generates?  Can these questions be ignored?  Who in the organization does the CEO expect to answer these questions?

The Second Prompt

While the CEO might not be able to use the AI, it is possible that a non-engineer, or someone who doesn't know how to program, knows enough to give more details in the prompt that improve on the first prompt.  A technical product manager could give the AI the extra details:

  1. single page web app using material design   
    1. have a header with the company logo on the left  
    2. have a settings menu on the right side of the header  
    3. ...  
  2. web app will get the list of headlines to display   
  3. pull headlines from RSS feeds and have a LLM return the items that are not political  
  4. build it with language X on the backend and basic JavaScript and CSS on the front end.

(4) was included to demonstrate how software often needs to fit in with a business' existing infrastructure.  If the AI returns a React app but there is no infrastructure to build and deploy React apps then we'd be looking at additional costs and efforts to add that ability simply because the AI had no knowledge of what is reasonable for a specific business.

LLMs are capable of generating this app, although the generated code didn't actually work and I had to guide it on how to fix things.  For now, let's assume that the LLM generated a fully working app.

If we were to stop here, we might conclude that software engineers aren't needed anymore, but I believe the real conclusion is that companies that previously had no engineers now have access to skills their organization never had before.  For example, a small company whose office manager built the spreadsheet they run their business on would benefit: the spreadsheet worked, but it is still limited.  The next step of expanding the spreadsheet, which would previously require some coding knowledge, can now be taken by the office manager with the aid of an AI.  There are cost savings for the company because they didn't have to hire a consultant (these companies probably don't have enough work to keep a full-time engineer on staff).

Remember that I left out a number of things that the AI cannot do, including turning the code into an app and deploying it, but the biggest hurdle would be if the AI did not meet all the requirements for the app initially.  I had success prompting the AI to make certain changes and enhancements, but multiple times the AI would make a change that didn't work and go further and further down the rabbit hole, and no prompting got it to fix the problem until I gave it very specific instructions on how to fix it ("In function X, change the max/min values to be Y for variable V").

Basically, maintaining, fixing and enhancing the app is where the challenge is, and that's where most engineers spend their time.

The Third Prompt

Prompt 3 (in real life, actually what I did first) was the fastest and most productive way to build a working application: have a design and give the AI many specific implementation instructions.  I knew the architecture, algorithm and code structure, and used that knowledge to guide the AI so that it was essentially a very fast typist of code:

  • Write a FetchRSS function that takes in a string value for the RSS feed configuration file path.  
  • In FetchRSS, open up the configuration file and loop through each line to get the URL.  
  • For each URL, fetch it and parse the RSS response into a slice of strings  
  • Oh, ignore any lines that are blank or start with '#'
  • ...

Since I enjoy typing out code, I tended to write my own stuff, but for the "boring" code (e.g. handling when err != nil) I'll have the AI write it.

I was able to complete the code much faster, make fewer trips to look up references and documentation, and had fewer bugs from typos.

Here's the problem, though.  While AI is capable of generating valid code, it isn't "working" code.  It isn't code that can be used directly in a business' infrastructure.  AI still struggles to understand an entire code base, and writing code that works within a business also requires understanding the environment the code runs in.  If the code runs in complete isolation it might be fine, but even a simple function such as "GetUserName" depends on where the user name is stored.  What integrations must the AI be aware of in order to get the user name?  In a real environment, the AI simply gives code snippets that the engineer must still adapt to the organization's infrastructure.

Conclusion

Having an AI capable of building software for your business on its own is not realistic.  My example app is too simplistic to be any company's product, and the AI still wasn't able to build it unaided.

Having a knowledgeable person on staff using AI will increase your staff's ability to do their jobs better and allow companies to do things they previously couldn't, but most of these things are not what software engineers typically do.  If companies try to have non-engineers do software engineering work with AI, it will likely result in decreased productivity.  Any gains from having something done quickly at the beginning will quickly be overshadowed by the inability to maintain and enhance the software.

It is ultimately a decision companies have to make for themselves.  Is a percentage reduction in engineer compensation a greater value than expanding the productivity and capabilities of the engineers by factors (reduced cost vs. growth)?

What is complex today will not be as complex tomorrow.  Those things will become common and AI will be able to take care of them, but as AI takes over more of the common stuff, engineers will be able to tackle the next level of complexity with more originality, because businesses will need it to survive.  There are leaders at technology companies boasting about being able to get rid of their software engineers, or how they will stop hiring more engineers.  These leaders are choosing to make cheap commodity products in exchange for growth and innovation, and they might find themselves racing to the bottom instead of accelerating to the top.

Can AI someday outpace people?  Maybe.  But not now.  Declaring that engineers aren't needed is premature.


Using AI/LLM to Implement a News Headline Filter

There are news topics that I'd rather avoid being bombarded with, but options for filtering headlines on the news sites are generally very limited.  You can tell Google News to block certain sources, and there's a "fewer like this" option that doesn't seem to do anything, but neither will block topics (e.g. "politics", "reality TV shows", etc.), so you still end up bombarded with things you don't want to see, even in your "personalized" views.  Fortunately, sites like Google News provide RSS feeds that make it easier to get the list of headlines, links and descriptions, and avoid scraping websites, which can be brittle since sites can change their layout at any time.

I decided to write my own filter to take out things that I'm not interested in and publish the result to a site that I can access from any browser.  The simplest way (one that doesn't even need a site to host the result) is to have a list of blocked words and drop any headlines containing those words.  As long as the app can access the RSS feed, even a mobile app can easily handle this kind of filtering, but creating and maintaining an effective block list becomes a challenge.  A political news article isn't going to be titled "Political Article on City Council Votes On Artificial Turf at Parks".  This is a good use case for using a language model to categorize a headline instead of using a keyword filter.

I created a prompt in natural language explaining what I don't want to see, along with some additional details to refine common definitions, and then sent my instructions with the list of headlines to the language model to have it return a filtered list:

> Remove all political headlines.  Headlines that include political figures, and celebrities who are active in politics such as Elon Musk, should also be removed.

Being able to use natural language to handle categorization makes it so much easier to build the app.  Categorization would have been the most challenging part of the project, but the LLM allowed me to get good results very quickly (it took just a few minutes to get the first results, though some more time to tweak the prompt).  Another benefit of large language models such as Gemini is that they understand multiple languages, so while I gave my instructions in English, the model can filter headlines in French, Chinese, Japanese, etc.

Using a language model does mean giving up some control, such as relying on what the model interprets "political" to mean.  Prompts can help refine the model's interpretation, but sometimes a headline makes it through the filter, and it is harder to determine why compared to an explicit algorithm where you can see the reasoning.

I encountered a problem where the LLM's response stopped midway.  This was because LLMs have limits on input and output tokens (how much info you can send and the size of the response it will send back).  The input token limit is so high that I'm not likely to have enough headlines to even remotely approach it (Gemini allows 1 million tokens on the free tier and 2 million tokens on the paid tier, and I'll be using around 20,000).  The output limit is much smaller (~8k), so if I wanted it to send back the complete details of the filtered headlines (title and link), it wouldn't be able to.  To address this, I send the LLM the headlines with an index and have it return just the indices.  If there were 100 headlines, the size of the output would be less than 200 tokens.
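The index trick can be sketched like this: number the headlines going into the prompt, then map the returned indices back to the full items.  The helper names are my own, and parsing assumes the LLM replies with a comma-separated list of indices, which is what the prompt would ask for:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// numberHeadlines formats headlines as "index. title" lines for the
// prompt, so the LLM can answer with indices instead of full text.
func numberHeadlines(titles []string) string {
	var b strings.Builder
	for i, t := range titles {
		fmt.Fprintf(&b, "%d. %s\n", i, t)
	}
	return b.String()
}

// keepByIndices maps a comma-separated index list (the LLM's reply,
// e.g. "0, 2") back to the original headlines.
func keepByIndices(titles []string, reply string) []string {
	var kept []string
	for _, f := range strings.Split(reply, ",") {
		i, err := strconv.Atoi(strings.TrimSpace(f))
		if err != nil || i < 0 || i >= len(titles) {
			continue // ignore anything that isn't a valid index
		}
		kept = append(kept, titles[i])
	}
	return kept
}

func main() {
	titles := []string{"City Opens New Park", "Senator Proposes Bill", "Local Team Wins"}
	fmt.Print(numberHeadlines(titles))
	// Pretend the LLM answered that headlines 0 and 2 are not political.
	fmt.Println(keepByIndices(titles, "0, 2"))
}
```

Since only indices cross the output-token boundary, the response size stays tiny no matter how long the titles and links are.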

A cron job executes the program and writes the result as JSON to the server, where the web page loads the JSON and displays the headlines:


The code is on GitHub and you can see it is a very simple program.

Wednesday, January 1, 2025

Powerline Error in VIM with Git

On Fedora 41, opening a Go file that is checked into Git results in an error message that flashes briefly before VIM shows the file.  I'm not sure of the exact cause, but it went away when I uninstalled powerline and reinstalled it.  Uninstalling powerline also uninstalled VIM for some reason, so the re-install was actually `dnf install vim powerline`.

Saturday, December 28, 2024

Framework 13 (AMD) First Impressions

I recently purchased a Framework 13 (AMD) laptop to replace my 7-year-old Razer Stealth as my travel companion, and wanted to share my first impressions after taking it on a trip and using it for about a month.

My main criteria for a travel laptop is that it must be light, but as I've gotten older I've added a few more things to look for:

  • Display must support good scaling.  As displays have improved their resolutions, everything shown has gotten smaller, so I have to scale it up for it to be comfortable for my eyes.
  • Replaceable battery.  As evident from using my previous laptop for 7 years and still not feeling the need to upgrade, I tend to keep my laptops for a long time, especially since I don't rely on my laptop as my primary driver.  While most parts of a laptop are capable of lasting a while, batteries are a different story.  I've had to replace the battery on my Razer twice because they started to swell.  This is not exclusive to Razer, as I've had it happen on Macbooks and Pixelbooks as well.
  • Linux support.  I mostly use Linux, especially for development, so I'm most comfortable with it, but I occasionally do have use for Windows (some games the family plays, camera apps, etc.).  The key reason, though, is that Windows is commercial and I don't want to be forced to pay to upgrade if it is not necessary.  The Razer Stealth ran Windows 10, and Microsoft says it's not compatible with Windows 11, so I either have to live without security updates or try to install Linux, which Razer doesn't support in any way.  Having good Linux support is a way to future-proof the laptop somewhat.

Given these criteria, I settled on the Framework.  It is lightweight (1.3 kg/2.9 lb) with a 13.5" 2880x1920 120Hz 2.8K matte display (the uniqueness of Framework is that it is modular, so you can have different displays, even ones released in the future) in a 3:2 ratio, which allows better scaling, especially with Linux.

The battery is replaceable (nearly everything on the Framework is replaceable/upgradable) and it fully supports Linux (Ubuntu and Fedora being officially supported, though it seems like most Linux distributions will work), even the fingerprint sensor.

First Impressions

Ordering the Framework, especially the DIY version, presents you with more choices than the typical ordering process.  Not only do you pick how much disk storage you want, you have options for what brand/specs you want, and this is for all the major components: display & bezel, keyboard, CPU, memory, storage, ports.  If you're familiar with computer components, most of the parts are understandable from the description, but for the keyboard it didn't explain the difference between US-English, International-English Linux and International-English (International English has the Euro key, and Linux swaps out the Windows key for a Super key).

I was impressed by how quickly the laptop shipped and arrived, given that it comes directly from Taiwan (where it was manufactured) and that each order has a different set of expansion ports (although I did pick a pretty standard set).  It arrived faster than the 5-7 business days listed when I ordered.

The packaging was nicely done to prevent anything from shifting during transit, and everything is recyclable.  The packaging includes the screwdriver needed to assemble all the components.

It took about 20 minutes to put everything together, and the instructions were good.  A lot of people could probably figure it out even without the instructions, but the instructions really prepare you.  They suggest putting on the bezel starting at the bottom, which definitely made it fit better without fidgeting, and they warn that the first boot will take longer, so you don't worry that a slow start-up means you did something wrong.

The instructions for installing the operating system are also pretty good.  For Windows, they anticipate that you might not want to use a Microsoft account, so they tell you how to bypass it, and how to deal with the fact that the laptop's drivers aren't present during installation, so you can get past the part where Microsoft wants networking working just to complete the installation.  For Linux, the instructions were decent but maybe a little outdated, especially the screenshots.  Although Fedora is one of the two officially supported distributions, the Ubuntu guides seemed more comprehensive.  They also favor Gnome in their instructions.

You can get the Framework with either an Intel processor/motherboard or an AMD processor/motherboard, and although the general sense is that the AMD version performs better, there's more information on the Intel side.

The display looks very nice with good contrast and brightness.  It's very comfortable on the eyes, and scaling in Linux was not a problem.  No complaints about the touch pad, which worked with Linux out-of-the-box.  The keyboard is comfortable with good travel and spacing, and wasn't too squishy.  If the Thinkpad is the bar, this isn't as good, but it's better than the last MacBook Pro I used.

The fingerprint sensor worked out-of-the-box as well, but if you aren't using Gnome, you need to use the command-line tool, fprintd-enroll, to register your fingerprint.

It's not clear whether Framework thinks you should run tuned-ppd to manage power profiles for both Intel and AMD, or whether that's just for Intel and you should stick with power-profiles-daemon for AMD.  On Fedora 41, if you install power-profiles-daemon, then each time it wants to change the profile (such as when you plug/unplug the power) SELinux will block it and give you a warning.

Although I had no problems with WiFi, the WiFi chip it comes with always seems to have a weaker signal than my other devices.  I think some people swap it out with the one that comes with the Intel board, so it's something I'm watching out for.

I've been pretty happy with the laptop so far and hope it'll last a long time.  I like the company's mission and hope they continue to succeed with their vision of modular, sustainable and environmentally friendly laptops.

System Configuration

  • AMD Ryzen™ 5 7640U
  • 2880x1920 120Hz 2.8K matte display
  • Crucial RAM 32GB Kit (2x16GB) DDR5 5600MHz
  • WD_BLACK 2TB SN850X NVMe SSD Solid State Drive - Gen4 PCIe, M.2 2280, Up to 7,300 MB/s

Thursday, January 5, 2023

Add Build info to a Go Binary

Having the build info directly in a binary is useful in helping to identify one binary from another, especially if you do a lot of compilations.  Manually updating that information in the source before each build is cumbersome and error prone, so it's better to automate it.

This can be done in Go using -ldflags with the build command.  For example, if you have a main.go file such as this:

package main

import "fmt"

var (
    build string
)

func main() {
    fmt.Printf("build date: %v\n", build)
}
Then you can build it with -ldflags to set the value of build to the current date when running go build (the inner quotes are needed because the output of date contains spaces):

go build -ldflags "-X 'main.build=`date`'" main.go

Be careful that other parts of your program don't change the value during runtime since it is just a variable.

To make it a little safer, you can put the variable into another package and not allow it to be updated.  You can, for example, create a package called "buildinfo":

package buildinfo

var (
    builddate = "0"
)

func GetBuild() string {
    return builddate
}

that is called by your main.go:

package main

import (
    "fmt"

    "example/buildinfo"
)

func main() {
    fmt.Printf("build date: %v\n", buildinfo.GetBuild())
}

You will then build your application with:

go build -ldflags="-X 'example/buildinfo.builddate=`date`'"

Running the program will now output something like this:

build date: Thu Jan  5 12:33:38 PM

Friday, April 9, 2021

Keep Go Module Directory Clean with GOMODCACHE

Go makes downloading projects and their dependencies very easy.  In the beginning there was go get, which downloads the project source code and its dependencies to $GOPATH/src.  With modules, all the dependencies are downloaded to $GOPATH/pkg/mod.  The ease of downloading and the lack of management control in the go command means it is easy for the two directories to grow in size and to lose track of which project led to the download of a particular package.

I recently started to play around with the Fyne UI toolkit.  I didn't initially know what other packages it would download, so I wanted to have Fyne and its dependencies in their own area.  The go command has a flag, -pkgdir, that is shared by the various commands.  From the documentation:

The build flags are shared by the build, clean, get, install, list, run, and test commands:

...

-pkgdir dir

install and load all packages from dir instead of the usual locations. For example, when building with a non-standard configuration, use -pkgdir to keep generated packages in a separate location.

This didn't work as I expected because it didn't seem to do anything at all.  Using the command

go build -pkgdir /tmp

resulted in all the downloaded packages still going to $GOPATH/pkg/mod.

What did work (thanks to seankhliao) is to set the GOMODCACHE variable, which sets not just the cache location but also the package download location:

GOMODCACHE=/tmp go build

All the downloaded dependency packages will now go to /tmp rather than $GOPATH/pkg/mod.

Honestly, I'm not really sure what -pkgdir is supposed to do.  Maybe it is only for artifacts that the build command generates?  What does it do when used with go get?

Wednesday, April 7, 2021

Local Go module file and Go tools magic

I really value that when working with Go there is no "hidden magic" in source code.  Go source code is essentially WYSIWYG.  You don't see decorators or dependency injection that might change the behavior after it is compiled or run, which would require you to understand not only the language and syntax but also additional tools' effects on the source code.  While this is true of the language, it is not true of the go command for Go's module system.

I've personally found Go modules to be more confusing than the original GOPATH.  I understand that it solves some of the complaints about GOPATH and also addresses the diamond dependency problem, but it also adds complexity to the developer workflow and under-the-hood magic.  Maybe that's to be expected when it goes beyond source code management and adds a whole package management layer on top, but I'd be much happier to deal with this added complexity and burden if the solution were complete (how about package clean up so my mod directory isn't growing non-stop?)!

Modules add the go.mod file that tracks all of a project's dependencies and their versions.  This introduces a problem when one is developing both applications and libraries, since the developer may have both the released production version and an in-development version of a library locally.  To point your application at the local library without constantly changing the import path in source code, the replace directive can be used.  But when committing the code, it is not ideal to submit the go.mod with the replace directives in it, as it will likely break the build for someone else checking out the code and can expose some private data (the local path might contain the user name).

Now developers have to add the replace directives locally, remove them right before submission and then put them back (without typos!).  Fortunately, in Go 1.14, the go commands (build, clean, get, install, list, run, and test) got a new flag, -modfile, which allows developers to point at an alternative go.mod file.  This allows the production version of the go.mod file to go unmodified during development/debug while a local dev version of go.mod can be excluded from getting committed (i.e. .gitignored).

This can be done on a per-project level by adding -modfile=go.local.mod to go [build | clean | get | install | list | run | test]:

go build -modfile=go.local.mod main.go

Note that whatever the file name is, it still has to end in .mod since the tool creates a local go.sum file based on a name similar to the local mod file, with the extension changed from .mod to .sum.

To apply the use of go.local.mod globally, update "go env":

go env -w GOFLAGS=-modfile=go.local.mod

go env -w will write the -modfile value to where Go looks for its settings:

Defaults changed using 'go env -w' are recorded in a Go environment configuration file stored in the per-user configuration directory, as reported by os.UserConfigDir.

So the flow that Jay Conrad pointed out in this bug thread would be as follows:

  1. Copy go.mod to go.local.mod. 
  2. Add go.local.mod to .gitignore.
  3. Run go env -w GOFLAGS=-modfile=go.local.mod. This tells the go command to use that file by default.
  4. Add any replace and exclude directives or other local edits.
  5. Before submitting and in CI, make sure to test without the local file: go env -u GOFLAGS or just -modfile=. 
  6. Probably also go mod tidy.

Tuesday, April 6, 2021

Listing installed packages on Fedora with DNF

To list the packages that are user installed:

dnf history userinstalled

To list all installed packages:

dnf list installed

Sunday, January 17, 2021

My Systems (2021)

Updated 8/7/2025 - Self built daily driver

Updated 5/21/2023 with Beelink system

2021 brings upgrades to the computers in the house, which have been fairly static over the past 7-8 years.  I got a couple of new systems and repurposed some parts from the old systems, so this post is mainly to inventory the new configurations for my own reference.

Daily Driver (Self-Built)

For years, I've gone with mini PCs because they were quiet and did not take up much space.  I had mine on the keyboard tray under the table top so they were out-of-the-way.  The shift to a regular desktop PC was triggered by two things:  
  1. The end of support for Windows 10 meant that my kids would need a new PC that can run Windows 11.
  2. The use of local LLMs requires a GPU.
The PN50 4800U was my daily driver and came with Windows 11, although I used Linux on it 99.9% of the time.  It obviously cannot have a GPU, which made trying LLMs on it painfully slow.  Instead of getting another PC, I decided to retire the kids' PC and replace it with my 4800U.  I can't do anything about the size of the desktop PC, but the beQuiet CPU fan addressed the noise concern.  The desktop is very quiet; normally I can't tell by sound whether it is on and have to look to see if the fans are spinning.

The larger case allowed me to install a GPU that enables me to run local LLMs.  This time, I'm just running Linux and not bothering with Windows.  If I do need Windows, I have a laptop that still has it or I'll just borrow the kids'.

Asus PN50 4800U

  • Ryzen 7 4800U [Zen2] (8 cores / 16 threads, base clock 1.8GHz, max 4.2GHz - 8 GPU cores - RX Vega 8, 15W)
  • 32 GB Crucial DDR4 3200Mhz  RAM (2x16GB)
  • 1TB Samsung 970 EVO Plus (M.2 NVMe interface) SSD
  • 500GB Crucial MX500 SATA SSD (2.5")
  • Intel WIFI 6, BT 5.0

The PN50 replaced my trusty Shuttle DS87 as my daily driver.  All the components are new except for the two Dell U2311H monitors, keyboard and mouse.  I added a third monitor, Dell U2421HE, because it has ethernet and a USB-C interface that can connect with the PN50 for DisplayPort, USB hub and ethernet.  This lowered the number of cables between the PN50 and peripherals and reduced clutter on my desk.

Despite its small size, the PN50 is still very high performing, but you do pay a price premium for having something small AND fast.  I did lose some connectivity (fewer USB ports and only 1 Ethernet port).  The USB ports can be addressed with a hub, and I use the monitor's built-in ethernet port to make up for the one I lost since the PN50 only has one.

The Dell AC511M is a USB soundbar and can be attached to the monitor stand (not to the monitor itself).  It draws its power from the USB connection but I found three flaws with it: 
  1. It has no power button so it turns on when the PC turns on.  To use the audio-in jack and the speaker means the PC must be turned on. 
  2. The speaker has a hiss to it like many speakers but with no power button the hiss is always there.  I had to plug something into the headphone jack so I don't hear it.
  3. When something is plugged into the audio-in jack no audio goes through the USB.  If there are two audio sources (e.g. PC and music player) they need to share a connection.  I have two PCs connected to the monitor (one on display port and one on hdmi) and I can't have one play through USB and one through the audio in without plugging-and-unplugging the audio-in cable.  Instead, I have a cable from the monitor's audio-out to the soundbar's audio-in and each machine plays through the DP/HDMI outputs.
My ideal system would still be something like the DS87 housing with a Ryzen 4700G (or 5700G), but those CPUs aren't very readily available and there are no DS87-like small form factor cases for them.  Update:  The day after I posted this, Shuttle announced this exact PC.  ^_^;  With my new monitor, though, I would like the new Shuttle to have a USB Type-C connection, but at least if I wanted to get more power then I know the option is now out there!

Asus PN50 4500U

  • Ryzen 5 4500U [Zen2] (6 cores / 6 threads, base clock 2.3GHz, max 4.0GHz - 6 GPU cores - RX Vega 6, 15W)
  • 2x 8GB 3200 DDR4 so-dimm by SK hynix
  • Intel 660p Series m.2 500GB SSD
  • Intel WI-FI 6 (GIG+) + BT 5.0
  • *Crucial 128GB m4 2.5" SSD

This system replaces my wife's Shuttle XH61 system and is an upgrade across the board over its predecessor.

System 3 (Shuttle DS87)

  • Shuttle PC DS87
  • Intel Core i7-4790S Processor (4 cores / 8 threads, 8M Cache, base clock 3.2 GHz, max 4.0GHz, 65W)
  • Samsung 850 EVO 500GB 2.5-Inch SATA III Internal SSD (MZ-75E500B/AM)
  • 2 x Crucial 16GB Kit (8GBx2) DDR3 1600 MT/s (PC3-12800)
  • *Intel Network 7260.HMWG WiFi Wireless-AC 7260 H/T Dual Band 2x2 AC+Bluetooth HMC
  • *Samsung 840 EVO Series 120GB mSATA3 SSD

This Shuttle had been my reliable daily driver for over 6 years running Linux.  I repurposed a Samsung SSD and an Intel wireless card from my Asus VivoMini to install Windows and add WiFi and bluetooth to the system.  The antennas that were in the VivoMini were hard to extract, so I took the antennas from an old ASUS Chromebook laptop that wasn't being used anymore.  

The VivoMini was being used for the kids' remote/distance learning but was a bit under-powered for handling some of the video conferencing features, so this system will now take its place.

ASUS PN50 4300U

  • Ryzen 3 4300U [Zen2] (4 cores / 4 threads, base clock 2.7GHz, max 3.7GHz - 5 GPU cores - RX Vega 5, 15W)
  • 16 GB Crucial (CT8G4SFRA32A) DDR4 3200Mhz  RAM (2x8 GB)
  • 500GB Samsung 970 EVO Plus (M.2 NVMe interface) SSD

This system is meant to be a more portable system for when I'm working at another location.  I paired it up with a portable monitor rather than getting a laptop since I don't need this to be a mobile system but one that I can easily transport.


Beelink SER5

This latest addition was added in 2023 as a secondary gaming PC and is used for school.  The specs are decent given the price of under $300 including Windows.

  • Ryzen 5 5500U (6 cores / 12 threads, base clock 2.1GHz, max 4.0 GHz, 7 core GPU,  @ 1800 MHz, 15W TDP)
  • 16 GB DDR4
  • 500GB NVME M.2 SSD
  • WiFi 6
  • BT 5.2
The system came with Windows 11 Pro.

----------------

System 4 (Shuttle XH61)

  • Intel Core i7-2600S Processor (4 cores / 8 threads, 8M Cache, base clock 2.8 GHz, max 3.8GHz, 65W)
  • *Seagate 300GB 7200RPM HDD (replaced the Corsair MX500 CT500MX500SSD1 500GB 2.5in SATA 6Gbps SSD)
  • TP-Link USB WiFi Adapter for Desktop PC, AC1300Mbps USB 3.0 WiFi Dual Band Network Adapter with 2.4GHz/5GHz High Gain Antenna, MU-MIMO
  • 8GB RAM

This system was originally put together in 2012 (with an SSD) and even in 2020 was a perfectly good system for most tasks.  When running Windows 10 or some basic games (Minecraft, Don't Starve) it still felt pretty snappy.  I wouldn't try running any graphics intensive games on it.  

The SSD from this system was moved to the PN50-4500U (system 2) and replaced with a 2.5" Seagate 300GB 7200RPM hard disk drive that I pulled out of the same Chromebook laptop I took the antennas from.  After switching to the mechanical disk drive, the system felt noticeably sluggish.  A solid state drive makes a big difference!  

I'm keeping this system around for schooling.

ASUS VivoMINI UN62


The ASUS VivoMini UN62 is a wonderfully small and quiet barebones system with very good build quality.  It was this system that gave me confidence in getting the ASUS PN50.  I actually own 3 of these systems and use them for different purposes, which have changed over time (e.g. media station, always-on server for Minecraft, etc).  More recently, however, the Raspberry Pi 4 has replaced the VivoMinis for some of the tasks.

The specs for my UN62s are:
  • Intel i3-4030U (2 cores / 4 threads, 1.9 GHz, 3 MB cache, 15W)
  • 16GB Crucial (2x8 GB DDR3-1600) 204-pin sodimm
  • Samsung 840 EVO 128GB msata3 SDD
  • Intel Network 7260.HMWG WiFi Wireless-AC 7260 H/T Dual Band 2x2 AC+Bluetooth HMC
Two served as the kids' computers until I upgraded their setup.  One was repurposed as the machine for schooling when remote/distance learning was put in place due to covid-19.  This system was replaced by System 3 and its drive and wireless card got moved to that system.

Raspberry Pi 4

  • Broadcom BCM2711, Quad core Cortex-A72 (ARM v8) 64-bit SoC @ 1.5GHz
  • 4GB LPDDR4-3200 SDRAM
  • 2.4 GHz and 5.0 GHz IEEE 802.11ac wireless, Bluetooth 5.0, BLE
  • Gigabit Ethernet
  • 2 USB 3.0 ports; 2 USB 2.0 ports.
  • Raspberry Pi standard 40 pin GPIO header (fully backwards compatible with previous boards)
  • 2 × micro-HDMI ports (up to 4kp60 supported)
  • 2-lane MIPI DSI display port
  • 2-lane MIPI CSI camera port
  • 4-pole stereo audio and composite video port
  • H.265 (4kp60 decode), H264 (1080p60 decode, 1080p30 encode)
  • OpenGL ES 3.0 graphics
  • Micro-SD card slot for loading operating system and data storage
  • 5V DC via USB-C connector (minimum 3A*)
  • 5V DC via GPIO header (minimum 3A*)
  • Power over Ethernet (PoE) enabled (requires separate PoE HAT)
  • Operating temperature: 0 – 50 degrees C ambient
  • Raspberry Pi ICE Tower Cooler, RGB Cooling Fan (excessive but looks cool on the desk).

The Raspberry Pi 4 is a small wonder of a machine that replaces what I originally used the ASUS VivoMini for and is significantly cheaper.


Friday, January 1, 2021

2021 PC - Asus PN50 4800U

Although I was very tempted to build a new desktop PC and get access to all the power goodness of the latest AMD Ryzen, I was hesitant giving up the small form factor that I had with my Shuttle PC DS87.  When the Asus PN50 with the AMD Ryzen 4800U became available I took the plunge.

The specs comparison between the previous and new PCs:

New PC:

  • Ryzen 7 4800U [Zen2] (8 cores / 16 threads, base clock 1.8GHz, max 4.2GHz - 8 GPU cores - RX Vega 8, 15W)
  • 32 GB Crucial DDR4 3200Mhz  RAM (2x16GB)
  • 1TB Samsung 970 EVO Plus (M.2 NVMe interface) SSD
  • 500GB Crucial MX500 SATA SSD (2.5")
  • Intel WIFI 6, BT 5.0

Previous PC:

  • Shuttle PC DS87
  • Intel Core i7-4790S Processor (8M Cache, base clock 3.2 GHz, max 4.0GHz, 65W)
  • Samsung 850 EVO 500GB 2.5-Inch SATA III Internal SSD (MZ-75E500B/AM)
  • 2 x Crucial 16GB Kit (8GBx2) DDR3 1600 MT/s (PC3-12800)

There are enough sites giving benchmarks so I'm not going to repeat what they've done, but I wanted something to show myself a tangible performance improvement.  It is generally during compilation when I wish things would go faster, so why not compare compilation between the two systems?  The multi-core (8 vs 4) and multi-thread (16 vs 8) advantage should benefit compilation even if the base clock of the 4800U is 1.8GHz while the i7's is 3.2GHz.  I'm also expecting a modern CPU to be more efficient per clock cycle than a 6-year-old CPU.

I decided to time the compilation of OpenCV using the following commands:

wget -O opencv.zip https://github.com/opencv/opencv/archive/master.zip
unzip opencv.zip 
mkdir -p build && cd build
cmake ../opencv-master/
time cmake --build .

i7 Results

real   28m57.219s
user   26m48.466s
sys     2m01.402s

4800U Results

real     36m48.166s
user     34m54.722s
sys       1m52.574s

How did this happen?  Was the i7's 3.2-4.0GHz range too much for the 4800U's 1.8-4.2GHz to overcome?  It did seem like during compilation all of the i7 cores were running at around 3.6GHz, but I suspected that the compiler was not actually taking advantage of all the cores of the 4800U.

I tried again using Ninja, which automatically configures the build to use multiple cores.

make clean
cmake -GNinja ../opencv-master/
time ninja

i7 Results

real	11m28.741s
user	85m39.188s
sys	 3m23.310s

4800U Results

real      6m39.268s
user     99m03.178s
sys       4m8.597s

This result looks more like what I expected.  More of the system cycles were used on both the i7 and the 4800U as more cores and threads were utilized, but the real time was much shorter.  This shows that for a lot of consumers, fewer cores with faster clock speeds might be better for desktops (laptops and battery life add another dimension), since they rely on applications being programmed to take advantage of multiple cores.  That's why gamer systems usually give up more cores for faster clock speeds, since games aren't known for utilizing multiple cores.

Friday, November 27, 2020

Building GUI applications with Go (Golang)

Go is my favorite programming language.  I have mostly used it for writing command line programs or server-side services so I was not familiar with using it for writing GUI desktop applications.  Questions about using Go for writing GUI applications come up periodically on Reddit or Hacker News with some saying that Go is not appropriate for GUIs while others argue the opposite.  

I decided to take an existing command line program that I've written and put a graphical interface on it using different GUI frameworks/tool kits that are available.  The program's purpose is very simple.  It checks for the latest stable version of Go for your system and if there is a newer version then it downloads it.  Once the file is downloaded, it verifies the checksum to make sure that it correctly downloaded a good file.  The UI will simply show the information (where to save, the version to download, the checksum) and a button to start downloading.  During download, a progress bar indicates what is happening. 




The code can be found at https://github.com/lazyhacker/getgo with the GUI files in the internal/gui package.  gtk.go and fyne.go are for GTK and Fyne respectively.

TLDR;

Go is perfectly capable of being used to write GUI applications as far as functionality goes.  There are tried-and-true toolkits such as GTK and Qt and emerging ones such as Fyne and Gio.  The former are more polished but come with an extra layer of non-Go code between the app and the graphics layer and a higher learning curve.  The latter's tooling, visual documentation and functionality are less developed.  

Where all of the options fall behind some other languages is in developer friendliness in the form of documentation.  

GTK does have a lot of documentation and many users who have posted answers on forums, but the documentation is based on other programming languages.  

Gio and Fyne are more lacking in documentation and tutorials, so it can be more frustrating for beginners looking to learn or find answers, although the general concepts might be more easily understood since they aren't as big a system as GTK.

All the toolkits I tried rely a lot on providing code examples as a form of teaching, but the examples aren't very well documented or discoverable, making them less friendly to developers.

Binary sizes for my simple program are:
  • command-line only ~6M
  • command-line + GTK GUI ~9M
  • command-line + Fyne GUI ~14M
GTK does require the shared libraries to be available on the system (they aren't compiled into the above binary) so they will need to be bundled in.

It took me about a day to write the Fyne version and about 2 days for the GTK version.  I also had to spend a day to figure out how to get both to compile and run on Windows.

Choices

There are actually many choices available to Go developers for building GUIs.  I think the perception that Go isn't a "GUI" language could be because:
  1. A number of GUI projects have been abandoned.  
  2. There is no single "blessed" GUI framework from the Go team.  
  3. There are no fully native Go implementation of a GUI toolkit.  
(1) Given the complexity of developing a GUI framework, I don't think it is unusual that there are a lot of abandoned GUI projects.  What is more important is what is available and actively maintained, since there are always many abandoned projects in any language, so don't let the noise give you the wrong impression.

(2) Although there are some languages that come with a GUI toolkit as part of the language (e.g. Java, Swift), most languages don't.  Go can feel like a "batteries included" language with its rich standard library (e.g. it basically comes with an HTTP server), so having no GUI option could lead to the misconception that the Go team doesn't believe Go should be used for GUI apps.  However, a lot of languages don't have one either (C/C++, Python, etc.)

(3) This really depends on the definition of a fully native Go implementation:  
  • Can app developers write everything in Go?  Can it be written in an idiomatic way?
  • Can the entire project just depend on the Go tool chain?
  • Is the whole tech stack built with Go?
With the exception of the toolkits whose philosophy is to combine Go for backend logic with another language for the front-end GUI (e.g. JavaScript), most of the options let the developer write everything in Go.  The Go tool chain includes cgo for interfacing with C code, so even if the toolkit depends on C libraries (e.g. OpenGL, GTK/Qt, etc.) the app developer doesn't really have to deal with other tool chains beyond maybe installing some libraries for cgo to access. 

Having the whole stack written in Go is not realistic.  With the exception of C, every language at some point has to deal with a lower level of the system (whether it be the graphics subsystem or the OS) that is written in another language.  

Personally, as long as everything I write is in Go and only the Go tools are used, then that is "native" enough for me.  As an applications layer developer, I don't expect to be working on the GUI underpinnings that would require combining languages and tool chains.  That means there are plenty of options for Gophers (developers that use Go), and my evaluation criteria are some subjective qualities, such as how intuitive it is and whether it is easy to learn/use, and some quantitative attributes, such as stability. 

With this in mind, the most often mentioned options are Qt, GTK, Fyne and Gio.  Between Qt and GTK, I chose GTK.  Both of these are popular production-level GUI toolkits written in C/C++.  The reason I picked GTK is that I use Linux and Gnome, plus the admittedly subjective perception that installing GTK is easier than installing a Qt dev environment.

I also wanted to try either Fyne or Gio as these are two toolkits that were built with Go.  Both rely on go-gl (which in turn depends on OpenGL) to deal with the graphics subsystem to draw the interface and widgets, so they don't have that extra layer between the app and graphics system occupied by another framework like GTK or Qt.


Gio (https://gioui.org)

The first GUI toolkit that I looked at was Gio.  It wholeheartedly embraces everything Go and the latest-and-greatest.  It allows compiling to desktops, mobile, and WASM (WebAssembly).  It supports Go modules (common) and drops GOPATH support (uncommon).  Its embrace of immediate-mode GUI programming is kind of Go-like in its belief that it's not always necessary to give up control to a framework.

Installing Gio to be ready to use is very easy.  Make sure the system already has the Wayland/X11 and OpenGL development libraries installed.  A simple 1-line install from dnf, apt or whatever Linux package manager will likely suffice.  Then it is simply a matter of importing the package in your Go code, and during the first compile, Go will pull down Gio and all its dependencies through Go modules.

However, I quickly moved away from Gio for a couple of reasons:
  1. It lacked documentation to help a new user understand how to use it, and I found the existing documentation to be poorly organized.  It primarily relies on code examples and API comments and then leaves it to the users to figure out for themselves how to use it.
  2. While immediate mode gives more control to the developers, building GUIs is one area where there's enough complexity that I don't necessarily mind handing it off to a toolkit to take care of things.  I wonder whether immediate mode is actually more useful to developers who build GUI toolkits than to developers who use the toolkits.
I might come back to Gio some day when I have more time to learn it.


Fyne (https://fyne.io)

Fyne is a more standard retained-mode GUI toolkit that is easy to install.  It is similar to Gio in that you just need to use your package manager to install the graphical development libraries and then just import Fyne to have it download all the necessary packages and dependencies.  Unlike Gio, it also supports the traditional GOPATH method if you don't use Go modules.

The documentation was much more comprehensive, featuring a quick beginner walk-through, tutorials and API documentation.  What holds Fyne back a little is the organization of the documentation, which required me to do quite a bit of jumping between sections to understand something.  For example, one section talks about a widget, but it is somewhere else that shows what the widget actually looks like.

I was able to put together a basic interface pretty quickly with Fyne thanks to its basic tutorials.  While getting something working quickly is a plus, I'll admit that I did not find the graphical elements very attractive.  It seems to be embracing material design in some form, but it feels incomplete.  Extending the look and feel is also difficult at the moment. 

Although I mentioned that I got a working GUI up quickly, I was immediately met with a bug.  The app would start and all the components would draw in the window before it would suddenly turn blank.  Resizing or hovering over a particular widget would bring the interface back.  I only saw this on my Linux system.  I reported this to Fyne and got a response pretty quickly asking me for more info and some follow up questions.  It's good that the Fyne developers are keeping an eye on bug submissions!


Microsoft Windows 

To compile and run on Windows basically requires installing a Windows version of GCC.  Fyne's install instructions give 3 options (msys2+MinGW-w64, TDM, and Cygwin) for GCC.  I went with msys2 + MinGW-w64 as msys2 is also the recommended way to install GTK.

Once msys2 was installed, I installed the mingw64 gcc package through the msys2 shell:

> pacman -S mingw-w64-x86_64-toolchain base-devel

Note: If you want to use GCC outside of the MinGW shell (e.g. with cmd.exe), you need to add:
  • C:\msys2\mingw64\bin
  • C:\msys2\usr\bin
to the Windows PATH variable.

After getting MinGW GCC installed, I tried to compile and promptly ran into a compilation error about undefined references.  After much digging, I found the solution to be deleting the go-build cache in %USERPROFILE%\AppData\Local\go-build.


Gotk3 (https://github.com/gotk3/gotk3)

To use GTK 3 with Go requires gotk3, which provides Go bindings to GTK's APIs (including glib, gdk, and others).  There is also go-gtk, which provides GTK 2 bindings.

gotk3 has installation instructions on its wiki.  Installing on Linux and macOS is very simple, similar to Gio and Fyne.  For Windows, as with Fyne, it requires installing msys2 + mingw-64.

Although GTK has extensive documentation and tutorials, it's mostly for C or Python.  gotk3's documentation, unfortunately, consists mainly of comments that are just the C function name.  It doesn't even provide a link to the C documentation, so you're required to find it yourself.  For the few APIs that have their own gotk3 comments, the C function name was dropped, so you will have to figure out what the Go function maps to.

gotk3 also follows the "read the code" school of teaching.  They have a directory of example code but little explanation of what each one is an example of.  The user is left to decipher all the example code to form an idea of how gotk3 works and what is available.  What I ended up doing was to go through and learn GTK in C first and then try to map the concepts to the gotk3 APIs.  This isn't the most friendly way to introduce a Gopher to GUI development with gotk3, but it might be okay for a C GTK developer coming to Go.

Of course, on my very first compilation, I get an error message:

go: finding module for package github.com/gotk3/gotk3/gtk
go: found github.com/gotk3/gotk3/gtk in github.com/gotk3/gotk3 v0.5.0
# github.com/gotk3/gotk3/gtk
../../gopath/pkg/mod/github.com/gotk3/gotk3@v0.5.0/gtk/gtk.go:5369:14: _Ctype_struct__GIcon can't be allocated in Go; it is incomplete (or unallocatable)

Fortunately, there was already a bug filed and a solution had already been committed.  I had to force Go to use a newer version of gotk3 than what has been tagged as the stable version:

**Update for Go 1.16, 4/3/2021**  Another bug came up when trying to use Go 1.16, and it looks like a fix made it to the master branch, so the same workaround as for the previous bug will now also work.

In the project directory:

go get github.com/gotk3/gotk3/gtk@master 




Microsoft Windows

While getting GTK installed on Linux was trivial, it takes a bit more effort on Windows, but the GTK page is very clear.  If you'll be doing everything within msys, then there's not much more to do.  If you want to build in another terminal such as cmd, then you'll need to add the mingw bin path to the %PATH% variable.

There seems to be a compatibility issue between gotk3 and GTK that requires removing the -Wl flag from gdk-3.0.pc, which is captured in the gotk3 wiki's instructions for installing on Windows.
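I don't recall the exact entry, but the fix amounts to stripping the "-Wl," prefix from the linker flags in that .pc file.  A hypothetical one-liner from the msys2 shell, where the pkgconfig path assumes a default msys2/mingw64 install:

```shell
# Strip "-Wl," prefixes from the linker flags in gdk-3.0.pc
# (path assumes a default msys2/mingw64 install)
sed -i 's/-Wl,//g' /mingw64/lib/pkgconfig/gdk-3.0.pc
```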