Showing posts with label Programming. Show all posts

Tuesday, March 25, 2025

Why the Model Context Protocol (MCP) Is Confusing

**Update 3/26/2025**

OpenAI added MCP support to their agent SDK.


--------------------------------

On Nov. 25, 2024, Anthropic published a blog post about the Model Context Protocol (MCP).  It didn’t get much attention at the time because it wasn’t an announcement of much significance.  Suddenly in 2025, MCP has gotten a lot of hype and attention, and it has also caused a lot of confusion about why so many YouTubers are talking about it.

It was nice that Anthropic published how they connect Claude with tools in the Claude Desktop app, even if the post was a bit of marketing to sell it as a standard and to encourage an open community.  There is a technical aspect (a protocol) to it, but it felt like a business play to get developers to extend Claude with plugins.

Large Language Models like Claude cannot perform any actions.  They’re like a brain with no body.  They might know that an email should be sent, but they can’t actually send the email.  To connect the “thought” (send email) with the action requires basic programming.  Using MCP is how Anthropic does it but it isn’t the only way.  Let’s take a look at various ways that this is accomplished and then see how MCP fits in.

You Are the Agent

In this scenario, a person goes to claude.ai and has a conversation with Claude about writing an email to invite someone to lunch.  Claude generates the email body and tells the person to copy it into an email program to send.  The person manually copies that text into their email or calendar app and sends the invitation.  The person is the agent because they are performing the action.

Using an AI Assistant

Here, a person uses an app (this can be a web app, desktop app, mobile app, etc.) such as a personal assistant à la Jarvis from Iron Man.  The user asks Jarvis to send an email to invite someone for lunch.  Jarvis composes the invitation message and sends the invitation through a calendar app so that it also records the event on the calendar.  So how does Jarvis do this?

Method 1

  1. Jarvis sends your question to the LLM along with prompts describing the available tool(s).
  2. Jarvis looks at the response to determine what tools to use.
  3. Jarvis executes the chosen tool(s) through the tool API.
  4. The results are sent back to the LLM.
  5. The LLM formulates a natural language response.
  6. The response is displayed to you!

In the early days (2023), this might be done like this:

The user speaks to Jarvis, “Jarvis, invite Thor to lunch next Wednesday.”

The Jarvis code passes the text to an LLM along with an additional prompt:

Respond to the user, but if the request is to add to a calendar then respond in JSON:

  {
    "Tool": "calendar",
    "Invitee": "<name>",
    "Date": "<date>",
    "Time": "<time>",
    "Body": "<message>"
  }

The Jarvis program gets the response and, if it is a tool response, parses the JSON and calls the calendar API.

The Jarvis code calls the LLM again with the text, “tell the user that the invite was successfully sent” and then returns the response to the user.
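
The parsing step in Method 1 might look roughly like the sketch below.  It assumes the LLM answers either with plain text or with the JSON shape from the prompt; `parseToolCall` and `sendInvite` are hypothetical names, and `sendInvite` only stands in for a real calendar API.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// ToolCall mirrors the JSON shape requested in the prompt.
type ToolCall struct {
	Tool    string `json:"Tool"`
	Invitee string `json:"Invitee"`
	Date    string `json:"Date"`
	Time    string `json:"Time"`
	Body    string `json:"Body"`
}

// parseToolCall reports whether the LLM reply is a tool request.
// A reply that is not a JSON object, or that names no tool, is treated
// as plain text meant for the user.
func parseToolCall(reply string) (ToolCall, bool) {
	trimmed := strings.TrimSpace(reply)
	if !strings.HasPrefix(trimmed, "{") {
		return ToolCall{}, false
	}
	var call ToolCall
	if err := json.Unmarshal([]byte(trimmed), &call); err != nil || call.Tool == "" {
		return ToolCall{}, false
	}
	return call, true
}

// sendInvite is a hypothetical stand-in for the real calendar API call.
func sendInvite(c ToolCall) string {
	return fmt.Sprintf("invited %s on %s at %s", c.Invitee, c.Date, c.Time)
}

func main() {
	// Example reply; the date and time values are made up for illustration.
	reply := `{"Tool": "calendar", "Invitee": "Thor", "Date": "2023-06-14", "Time": "12:00", "Body": "Lunch?"}`
	if call, ok := parseToolCall(reply); ok && call.Tool == "calendar" {
		fmt.Println(sendInvite(call))
	} else {
		// Not a tool response: show the text to the user as-is.
		fmt.Println(reply)
	}
}
```

The fragile part of this approach is that the program has to guess whether a reply is a tool request, which is what the registration feature in Method 2 addresses.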

Method 2

  1. Developer registers available tools
  2. Jarvis sends your question to the LLM
  3. The LLM analyzes the available tools and decides which one(s) to use
  4. Jarvis executes the chosen tool(s) through the tool API
  5. The results are sent back to the LLM
  6. The LLM formulates a natural language response
  7. The response is displayed to you!

An enhancement was added to many LLM APIs to allow developers to register tools along with their purpose and parameters.  The Jarvis code will register a tool called “calendar”, give it a description such as “Tool to add, update and remove user’s calendar.”, and list the parameters it needs.

Now, when Jarvis passes “Jarvis, invite Thor to lunch next week,” to the LLM, it will respond with JSON and Jarvis can call the calendar API.

The Jarvis code calls the LLM again with the text, “tell the user that the invite was successfully sent” and then returns the response to the user.
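
On the Jarvis side, registration and dispatch could be sketched as below.  This is not any particular provider's API: the `Tool`, `Registry` and `toolRequest` types and the `name`/`arguments` reply shape are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Tool describes one registered tool. The name, description and parameters
// are what the LLM sees; Run is what Jarvis executes when the tool is chosen.
type Tool struct {
	Name        string
	Description string
	Parameters  []string
	Run         func(args map[string]string) (string, error)
}

// Registry maps tool names to registered tools.
type Registry map[string]Tool

// Register adds a tool under its name.
func (r Registry) Register(t Tool) { r[t.Name] = t }

// toolRequest is the assumed shape of the LLM's structured reply.
type toolRequest struct {
	Name      string            `json:"name"`
	Arguments map[string]string `json:"arguments"`
}

// Dispatch decodes the LLM's reply and runs the matching tool.
func (r Registry) Dispatch(reply string) (string, error) {
	var req toolRequest
	if err := json.Unmarshal([]byte(reply), &req); err != nil {
		return "", fmt.Errorf("decode tool request: %w", err)
	}
	tool, ok := r[req.Name]
	if !ok {
		return "", fmt.Errorf("unknown tool %q", req.Name)
	}
	return tool.Run(req.Arguments)
}

// newRegistry registers the calendar tool with a stand-in implementation.
func newRegistry() Registry {
	r := Registry{}
	r.Register(Tool{
		Name:        "calendar",
		Description: "Tool to add, update and remove user's calendar.",
		Parameters:  []string{"invitee", "date", "time", "body"},
		Run: func(args map[string]string) (string, error) {
			return fmt.Sprintf("invite sent to %s for %s", args["invitee"], args["date"]), nil
		},
	})
	return r
}

func main() {
	r := newRegistry()
	reply := `{"name": "calendar", "arguments": {"invitee": "Thor", "date": "next week"}}`
	result, err := r.Dispatch(reply)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(result)
}
```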

Method 3 (MCP)

  1. User registers available tools.
  2. Jarvis sends your question to Claude
  3. Claude analyzes the available tools and decides which one(s) to use
  4. Jarvis executes the chosen tool(s) through the MCP server, which calls the tool API
  5. The results are sent back to Claude
  6. Claude formulates a natural language response
  7. The response is displayed to you!

With MCP, the user (on desktop/mobile) or developer (on cloud) registers MCP servers with Jarvis.  Jarvis can then get the tool descriptions from the MCP server and pass them to the LLM.

When Jarvis passes “Jarvis, invite Thor to lunch next week,” to the LLM, the LLM will determine the tool to use.

Jarvis will then call the MCP server to send the calendar invite.

The Jarvis code calls the LLM again with the text, “tell the user that the invite was successfully sent” and then returns the response to the user.
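
To make the hand-off concrete: MCP messages are JSON-RPC 2.0, and to my understanding a client lists a server's tools with the `tools/list` method and invokes one with `tools/call`.  The sketch below only builds those two request bodies; the `calendar` tool name and its arguments are hypothetical, and the transport (stdio or HTTP) is left out.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rpcRequest is a JSON-RPC 2.0 request envelope.
type rpcRequest struct {
	JSONRPC string `json:"jsonrpc"`
	ID      int    `json:"id"`
	Method  string `json:"method"`
	Params  any    `json:"params,omitempty"`
}

// callParams carries the tool name and its arguments for tools/call.
type callParams struct {
	Name      string         `json:"name"`
	Arguments map[string]any `json:"arguments"`
}

// listToolsRequest asks the MCP server which tools it offers.
func listToolsRequest(id int) ([]byte, error) {
	return json.Marshal(rpcRequest{JSONRPC: "2.0", ID: id, Method: "tools/list"})
}

// callToolRequest asks the MCP server to run one tool.
func callToolRequest(id int, name string, args map[string]any) ([]byte, error) {
	return json.Marshal(rpcRequest{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "tools/call",
		Params:  callParams{Name: name, Arguments: args},
	})
}

func main() {
	list, err := listToolsRequest(1)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(list))

	// "calendar" and "invitee" are hypothetical; a real server defines its own.
	call, err := callToolRequest(2, "calendar", map[string]any{"invitee": "Thor"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(call))
}
```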

Comparison

With MCP, tool registration is passed to the user and the tool description is handed off to the tool developer, but otherwise the steps remain the same.

Method 2

  1. Developer registers available tools
  2. Jarvis sends your question to the LLM
  3. The LLM analyzes the available tools and decides which one(s) to use
  4. Jarvis executes the chosen tool(s) through the tool API
  5. The results are sent back to the LLM
  6. The LLM formulates a natural language response
  7. The response is displayed to you!

Method 3

  1. User registers available tools.
  2. Jarvis sends your question to Claude
  3. Claude analyzes the available tools and decides which one(s) to use
  4. Jarvis executes the chosen tool(s) through the MCP server, which calls the tool API
  5. The results are sent back to Claude
  6. Claude formulates a natural language response
  7. The response is displayed to you!

Comparing the different methods shows that the steps are the same and only the implementation is different.  This is one reason there’s a lot of confusion: there seems to be very little benefit.

Having a standard protocol can be advantageous, but only when all the LLMs adopt it.  Otherwise it is just how to interact with Claude.


MCP servers are potentially reusable and might ease integration, which is a benefit since it’ll be like having only one API to learn.  This requires wide adoption and availability, which isn’t a given even if it is backed by one of the big LLM providers.

Shortcomings of the MCP

As a protocol, it has a lot of shortcomings, and the technical benefits are minor.

Some technical shortcomings are:

  • There’s no discovery mechanism other than manual registration of MCP servers.
  • There are now extra MCP servers in the tech stack for work that could be done by a library.

Thus the main benefits will come only if the protocol is adopted as a standard.

Saturday, February 15, 2025

WebAssembly (WASM) with Go (Golang) Basic Example

I first wrote about using Go for WebAssembly (WASM) 6 years ago, right before the release of Go 1.11, which was the first time Go supported compiling to WASM.  Go's initial support for WASM had many limitations (some of which I listed in my initial article) that have since been addressed, so I decided to revisit the topic with some updated examples of using Go for WASM.

Being able to compile code to WASM now allows:

  • Go programs to run in the browser. 
  • Go functions to be called by JavaScript in the browser.
  • Go code to call JavaScript functions through syscall/js.
  • Go code access to the DOM.

Setup

Go's official Wiki now has an article on the basics of using Go for WASM including how to set the compile target and setup.  

A quick summary of the steps to the process:

Compile to WASM with the output file name ending in .wasm, since the MIME type mapping in /etc/mime.types most likely keys on the wasm extension.

> GOOS=js GOARCH=wasm go build -o <filename>.wasm

Copy the JavaScript support file to your working directory (wherever your http server will serve it from).  The wasm_exec.js file must match the version of Go being used, so consider making this copy part of the build script.

> cp "$(go env GOROOT)/lib/wasm/wasm_exec.js" .

Then add the following to the html file to load the WASM binary: 

<script src="wasm_exec.js"></script>
<script>
    const go = new Go();
    WebAssembly.instantiateStreaming(fetch("main.wasm"), go.importObject).then((result) => {
        go.run(result.instance);
    });
</script>

Keeping It Running

It is a good starting point, but the Go code example is too simplistic.  It only demonstrates that the WASM binary created by Go can be loaded by having it write a line to the browser's console.  The Go program basically gets loaded by the browser, prints a line and exits.  Most of the time, it's probably desirable to have the WASM binary get loaded and stay running.  This can be achieved by having a channel that keeps waiting:

c := make(chan struct{}, 0)
<- c

or even easier:

select {}

Having either of these at the end of main() will keep the program alive after it gets loaded into the browser.

In Place of JavaScript

Being able to access the DOM excites me the most because it allows me to avoid writing JavaScript, followed by being able to run Go programs in the browser.  While I think the inter-op between Go and JavaScript is probably the most practical application, it's not something I've had to do much since I'm not a front-end developer doing optimizations or trying to reuse Go code between the front-end and back-end.

I don't mind using HTML for UI development or even CSS; I'm just personally not a fan of JavaScript.  This isn't to say that it is bad, just that I prefer other languages, the same way some people prefer C++, Java, Python, etc.  I don't have fun writing JavaScript like I do with Go, even though I know JavaScript.

Take a basic example of a web app (index.html) with a button to illustrate:

<!DOCTYPE html>                                                                 
  <html lang="en">                                                                  
  <head>                                                                          
      <meta charset="UTF-8">                                                        
      <meta name="viewport" content="width=device-width, initial-scale=1.0">        
      <title>Example</title>                                         
  </head>                                                                           
  <body>                                                                          
                                                                                  
      <button id="myButton">Click Me</button>                                                                                                     
                                                                                  
  </body>
  </html>

JavaScript is used to attach an event to it so that when the button is clicked, an alert message pops up:

    // Select the button element                                            
    const button = document.getElementById('myButton');                     
                                                                                  
    // Attach an event listener to the button                               
    button.addEventListener('click', function() {                           
       alert('Button clicked!');                                           
    }); 

With WASM, the JavaScript can be replaced with Go code:

package main

import (
    "honnef.co/go/js/dom/v2"
)

func main() {
    document := dom.GetWindow().Document()

    // Select the button element
    button := document.GetElementByID("myButton")

    // Attach an event listener to the button
    button.AddEventListener("click", false, func(dom.Event) {
        dom.GetWindow().Alert("Button clicked!")
    })

    // Block forever so the program stays alive to handle clicks
    select {}
}

In this case, the Go code looks extremely similar to the JavaScript code because I'm using the honnef.co/go/js/dom/v2 package.  It is a Go binding to the JavaScript DOM APIs and I find that it makes it more convenient than using syscall/js directly.

Why do I prefer this over writing JavaScript, especially when they look so similar?  The main reason is that most code is not just calling an API.  There's other logic that is implemented, and for that I can use Go and Go's libraries along with the benefits of a compiled and type-safe language.

There are still things that need to be considered before using Go and WASM for a general consumer production web app.  The binary size can be large, so it needs to be weighed against your audience.  But if I'm doing a hobby project for myself, or corporate apps where I know the users have fast connections, or if app performance and functionality outweigh the initial download and memory usage, I'd try to use it.

Monday, February 10, 2025

The Death of Software Engineering Is Premature

There is a lot to be excited about when it comes to advancements in language models.  The people who benefit the most are software engineers, because these models can enhance the productivity of knowledgeable engineers.

While a simple prompt such as "build me a web app with a conversation interface," will result in a functional web app (this is great because building a basic commodity application is now accessible to everyone),  these aren't the type of applications software engineers are normally tasked to write.  Software engineers are tasked with building multi-layered and evolving software that can't be fully described with a few prompts.  To build the software that businesses need requires skilled and knowledgeable people to direct the engineering work of the AI.

I built a simple app to filter news headlines that is small enough to be digestible in a post that I think shows how far a layman can go with building software using LLMs and why software engineers are still needed.  

The First Prompt

Let's start with what many executives might think makes software engineers unnecessary: 

Build me an app that will filter out political articles from Google News.

ChatGPT understood enough to generate an app using keywords to filter the headlines.  One time it created a React app and a second time it used Python.  Both times it used newsapi.org to get the news feeds and required you to understand how to build and run the app.  The main issue is that the app isn't really what I wanted.  It provides a news search whose results are then matched against keywords to decide what to filter out.  I wanted the news that I normally see when I visit news.google.com minus the political articles, so I told ChatGPT precisely that:

I don't want to search for news.  I want to see the articles that I see when visiting news.google.com minus political articles

The first time I asked ChatGPT, it understood this enough to switch to using the Google News RSS feed, which is excellent!  The second time, it concluded that the way to do it is to scrape news.google.com.  Both of these prompts highlight that some specialized knowledge is needed.  Does the programming language matter?  Should it use RSS or web scraping?  How do you run the code it generates?  Can these questions be ignored?  Who in the organization does the CEO expect to be able to answer these questions?

The Second Prompt

While the CEO might not be able to use the AI, it is possible that a non-engineer or someone who doesn't know how to program can know enough to give more details in the prompt that improves on the first prompt.  A technical product manager could give the AI the extra details: 

  1. single page web app using material design   
    1. have a header with the company logo on the left  
    2. have a settings menu on the right side of the header  
    3. ...  
  2. web app will get the list of headlines to display   
  3. pull headlines from RSS feeds and have a LLM return the items that are not political  
  4. build it with language X on the backend and basic JavaScript and CSS on the front end.

(4) was included to demonstrate how software often needs to fit in with a business' existing infrastructure.  If the AI returns a React app but there is no infrastructure to build and deploy React apps then we'd be looking at additional costs and efforts to add that ability simply because the AI had no knowledge of what is reasonable for a specific business.

LLMs are capable of generating this app, although the generated code didn't actually work and I had to guide it on how to fix things.  For now, let's assume that the LLM generated a fully working app.

If we were to stop here then we might conclude that software engineers aren't needed anymore, but I believe the actual conclusion is that companies that previously did not have engineers now have access to skills their organization never had before.  For example, a small company whose office manager built a spreadsheet that the business runs on would benefit, because while the spreadsheet works it is still limited.  Extending it would previously have required some coding knowledge, but now the office manager can do it with the aid of an AI.  There is a cost saving for the company because they didn't have to hire a consultant (these companies probably don't have enough work to hire a full-time engineer to be on staff).

Remember that I left out a number of things that the AI cannot do, including turning the code into an app and deploying the app, but the biggest hurdle would be if the AI did not meet all the requirements for the app initially.  I had success prompting the AI to make certain changes and enhancements, but multiple times the AI would make a change that didn't work and go further and further down the rabbit hole, and no prompting got it to fix the problem until I gave it very specific instructions on how to fix it ("In function X, change the max/min values to be Y for variable V").

Basically, maintaining, fixing and enhancing the app is where the challenge is and that's where most engineers are spending their time.

The Third Prompt

Prompt 3 (which in real life was actually what I did first) was the fastest and most productive way towards building a working application: have a design and give the AI many specific implementation instructions.  I knew the architecture, algorithm and code structure and used that knowledge to guide the AI so that it was essentially a very fast typist of code:

  • Write a FetchRSS function that takes in a string value for the RSS feed configuration file path.  
  • In FetchRSS, open up the configuration file and loop through each line to get the URL.  
  • For each URL, fetch it and parse the RSS response into a slice of strings  
  • Oh, ignore any lines that are blank or starts with '#'  
  • ...

Since I enjoy typing out code, I tended to write my own stuff, but for the "boring" code (e.g. handling when err != nil) I'd have the AI write it.

I was able to complete the code much faster, make fewer trips to look up references and documentation, and end up with fewer bugs from typos.

Here's the problem, though.  While AI is capable of generating valid code, it isn't "working" code.  It isn't code that can be used directly in a business' infrastructure.  AI still struggles to understand an entire code base, and writing code that works within a business also requires understanding the environment the code is running in.  If the code will be run in complete isolation then it might be able to run, but even a simple function such as "GetUserName" depends on where the user name is stored.  What integration must the AI be aware of in order to get the user name?  In a real environment, the AI simply gives code snippets that the engineer must still adapt to the organization's infrastructure.

Conclusion

Having an AI capable of building software for your business is not realistic.  My example app is too simplistic to be any company's product and it still wasn't able to do that.  

Having a knowledgeable person on staff using AI will increase your staff's ability to do their jobs better and allow companies to do things they previously couldn't, but most of these things are not things software engineers typically do.  Having non-engineers do software engineering work with the AI will likely result in decreased productivity.  Any gains from having something done quickly at the beginning will be quickly overshadowed by the inability to maintain and enhance the software.

It is ultimately a decision companies have to make for themselves.  Is a percentage reduction in engineering compensation worth more than multiplying the productivity and capabilities of the engineers (reducing cost vs. growth)?

What is complex today will not be as complex tomorrow.  They will become common and AI will be able to take care of those things, but as AI takes over more of the common stuff that will allow engineers to tackle the next level of complexity with more originality because businesses will need it to survive.  There are leaders at technology companies boasting about being able to get rid of their software engineers or how they will stop hiring more engineers.   These leaders are making a choice to make cheap commodity products in exchange for growth and innovation, but they might find themselves racing to the bottom instead of accelerating to the top.  

Can AI someday outpace people?  Maybe.  But it's not now.  Declaring that engineers aren't needed is premature.


Using AI/LLM to Implement a News Headline Filter

There are news topics that I'd rather avoid being bombarded with, but options for filtering headlines on the news sites are generally very limited.  You can tell Google News to block certain sources and there's a "fewer like this" option which doesn't seem to do anything, but neither will block topics (e.g. "politics", "reality TV shows", etc.), so you still end up getting bombarded with things you don't want to see even in your "personalized" views.  Fortunately, sites like Google News provide RSS feeds that make it easier to get the list of headlines, links and descriptions and avoid scraping websites, which can be brittle as the sites can change their layout at any time.

I decided to write my own filter to take out things that I'm not interested in and publish the result to a site that I can access from any browser.  The simplest way (and one that doesn't even need a site to host the result) is to have a list of blocked words and drop any headlines with those words.  As long as the app can access the RSS feed, even a mobile app can easily handle this kind of filtering, but creating and maintaining an effective block list becomes a challenge.  A political news article isn't going to be titled "Political Article on City Council Votes On Artificial Turf at Parks."  This is a good use case for using a language model to categorize a headline instead of using a keyword filter.

I created a prompt in natural language explaining what I don't want to see, along with some additional details to add to common definitions, and then sent my instructions with the list of headlines to the language model and had it return a filtered list:

Remove all political headlines.  Headlines that includes political figures, and celebrities who are active in politics such as Elon Musk should also be removed.

Being able to use natural language to handle categorization makes it so much easier to build the app.  The categorization would've been the most challenging part of the project, but the LLM allowed me to get good results very quickly (it took just a few minutes to get the first results but some more time to tweak the prompt).  Another benefit of large language models such as Gemini is that they understand multiple languages, so while I gave my instructions in English, it can filter headlines in French, Chinese, Japanese, etc.

Using a language model does mean giving up some control, such as relying on what the model interprets "political" to mean.  Prompts can help refine the model's interpretation, but sometimes a headline makes it through the filter, and it is harder to work out why than when you can read the algorithm and follow its reasoning.

I encountered a problem where the LLM's response stopped midway.  This was because LLMs have limits on input and output tokens (how much info you can send and the size of the response it will send back).  The input token limit is so high that I'm not likely to have enough headlines to even remotely approach its limit (Gemini allows 1 million tokens on the free tier and 2 million tokens on the paid tier, and I'll be using around 20,000).  The output limit is much smaller (~8k), so if I wanted it to send back the complete details of the filtered headlines (title and link) it wouldn't be able to.  To address this problem, I send the LLM the headlines with an index and have it return just the indexes.  For 100 headlines, the size of the output is less than 200 tokens.

A cron job will execute the program and write the result as JSON to the server, where the web page will load the JSON and display the headlines.


The code is on GitHub and you can see it is a very simple program.

Wednesday, January 1, 2025

Powerline Error in VIM with Git

On Fedora 41, opening a Go file that is checked into Git results in an error message that flashes briefly before VIM shows the file.  I'm not sure what the exact cause is, but it went away when I uninstalled powerline and reinstalled it.  Uninstalling just powerline also uninstalled VIM for some reason, so the re-install was actually `dnf install vim powerline`.

Saturday, December 28, 2024

Framework 13 (AMD) First Impressions

I recently purchased a Framework 13 (AMD) laptop to replace my 7-year-old Razer Stealth as my travel companion and wanted to share my first impressions after taking it on a trip and using it for about a month.

My main criteria for a travel laptop is that it must be light, but as I've gotten older I've added a few more things to look for:

  • Display must support good scaling.  As displays have improved their resolutions, everything shown has gotten smaller, so I have to scale it up for it to be comfortable for my eyes.
  • Replaceable battery.  As evident from using my previous laptop for 7 years and still not feeling the need to upgrade, I tend to keep my laptop for a long time especially since I don't rely on my laptop to be my primary driver.  While most parts of a laptop are capable of lasting awhile, batteries are a different story.  I've had to replace the battery on my Razer twice because they started to swell.  This is not exclusive to Razer as I've had it happen on Macbooks and Pixelbooks as well.
  • Linux support.  I mostly use Linux, especially for development, so I'm most comfortable with it, but I occasionally do have use for Windows (some games the family plays, camera apps, etc.).  The key reason, though, is that Windows is commercial and I don't want to be forced to pay to upgrade if it is not necessary.  The Razer Stealth ran Windows 10 and Microsoft says it's not compatible with Windows 11, so I either have to live without security updates or try to install Linux when Razer doesn't support Linux in any way.  Having good Linux support is a way to future-proof the laptop somewhat.

Given these criteria, I settled on the Framework.  It is lightweight (1.3 kg/2.9 lb) with a 13.5" 2880x1920 120Hz 2.8K matte display in a 3:2 ratio, which allows better scaling, especially with Linux.  (What makes Framework unique is that it is modular, so you can swap in a different display, including new ones released in the future.)

The battery is replaceable (nearly everything on the Framework is replaceable/upgradable) and it fully supports Linux, even the fingerprint sensor (Ubuntu and Fedora are the officially supported distributions, but it seems like most Linux distributions will work).

First Impressions

Ordering the Framework, especially the DIY version, presents you with more choices than the typical ordering process.  Not only do you pick how much disk storage you want, you have options for what brand/specs you want, and this is for all the major components: display & bezel, keyboard, CPU, memory, storage, ports.  If you're familiar with computer components, most of the parts are understandable from the description, but for the keyboard it didn't explain the difference between US-English, International-English Linux and International-English (International English has the Euro key and Linux swaps out the Windows key with a Super key).

I was impressed by how quickly the laptop shipped and arrived, given that it comes directly from Taiwan (where it was manufactured) and that each order has a different set of expansion ports (although I did pick a pretty standard set of ports).  It arrived faster than the 5-7 business days listed when I ordered.

The packaging was nicely done to prevent anything from shifting during transit and everything is recyclable.  The packaging includes the screwdriver needed to assemble all the components.

It took about 20 minutes to put everything together and the instructions were good.  A lot of people can probably figure it out even without the instructions, but the instructions really prepare you.  They suggest putting on the bezel starting at the bottom, which definitely allowed it to fit better without fidgeting, and they warn that the first boot will take longer so you don't worry that a slow start means you did something wrong.

It gives pretty good instructions on installing the operating system.  For Windows, it anticipated that you might not want to use a Microsoft account, so it tells you how to bypass it.  It also covers the fact that the laptop drivers aren't there during installation, and how to get past the part where Microsoft wants networking to be working just to complete the installation.  For Linux, the instructions were decent but maybe a little outdated, especially the screenshots.  Although Fedora is one of the two officially supported distributions, the Ubuntu guides seemed more comprehensive.  They also favor Gnome in their instructions.

You can get the Framework with either an Intel processor/motherboard or an AMD processor/motherboard, and although the general sense is that the AMD version performs better, there's more information on the Intel version.

The display looks very nice with good contrast and brightness.  It's very comfortable to the eyes and scaling in Linux was not a problem.  No complaints about the touch pad and it worked with Linux out-of-the-box.  The keyboard is comfortable with good travel and spacing.  It wasn't too squishy.  If the Thinkpad is the bar, this isn't as good as that, but better then the last Macbook Pro I used . 

The fingerprint sensor worked out-of-the-box as well, but if you aren't using Gnome, you need to use the command-line tool, fprintd-enroll, to register your fingerprint.

It's not clear whether Framework thinks you should run tuned-ppd to manage power profiles for both Intel and AMD or whether that's just for Intel and to stick with power-profiles-daemon for AMD.  On Fedora 41, if you install power-profiles-daemon then each time it wants to change the profile (such as when you plug/un-plug the power) SELinux will block it and give you a warning.

Although I had no problems with WiFi, the WiFi chip it comes with always seems to have a weaker signal than other devices.  I think some people swap it out with the one that comes with the Intel board so it's something I'm watching out for.

I've been pretty happy with the laptop so far and hope it'll last a long time.  I like the company's mission and hope they continue to succeed with their vision of modular, sustainable and environmentally friendly laptops.

System Configuration

  • AMD Ryzen™ 5 7640U
  • 2880x1920 120Hz 2.8K matte display
  • Crucial RAM 32GB Kit (2x16GB) DDR5 5600MHz
  • WD_BLACK 2TB SN850X NVMe SSD Solid State Drive - Gen4 PCIe, M.2 2280, Up to 7,300 MB/s

Thursday, January 5, 2023

Add Build info to a Go Binary

Having the build info directly in a binary helps to tell one binary from another, especially if you do a lot of compilations.  Manually updating that information in the source before each build is cumbersome and error prone, so it's better to automate it.

This can be done in Go using -ldflags with the build command.  For example, if you have a main.go file such as this:

package main

import "fmt"

// build is left empty here and set at link time with -ldflags.
var (
    build string
)

func main() {
    fmt.Printf("build date: %v\n", build)
}
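Then you can build it with -ldflags to set the value of build to the current date when using the go build command (the single quotes keep the spaces in the date output from being split into separate linker arguments):
go build -ldflags "-X 'main.build=`date`'" main.go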

Be careful that other parts of your program don't change the value during runtime since it is just a variable.

To make it a little safer, you can put the variable into another package and leave it unexported so that code outside the package can't update it.  You can, for example, create a package called "buildinfo":

package buildinfo                                                                      
                                                                                   
var (                                                                              
    builddate = "0"                                                                
)                                                                                  
                                                                                   
func GetBuild() string {                                                           
                                                                                   
    return builddate                                                               
}                                                                                  

that is called by your main.go:

package main                                                                       
                                                                                   
import (                                                                           
    "fmt"                                                                          
                                                                                   
    "example/buildinfo"                                                                
)                                                                                  
                                                                                   
func main() {                                                                      
                                                                                   
    fmt.Printf("build date: %v\n", buildinfo.GetBuild())
                                                                                
}

You will then build your application with:

go build -ldflags="-X 'example/buildinfo.builddate=`date`'"

Running the program will now output something like this:

build date: Thu Jan  5 12:33:38 PM
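
One thing worth keeping in mind is that -X only sets package-level string variables, and a binary built without the flag simply keeps whatever value is in the source.  As a minimal sketch based on the first main.go layout above, buildLabel is a hypothetical helper (it is not part of the original program) that prints a placeholder when nothing was injected:

```go
package main

import "fmt"

// build is meant to be overwritten at link time, for example with:
//   go build -ldflags "-X 'main.build=$(date)'"
// Without that flag it stays the empty string.
var build string

// buildLabel is a hypothetical helper: it returns the injected value,
// or a placeholder when the binary was built without -ldflags.
func buildLabel(b string) string {
	if b == "" {
		return "unknown"
	}
	return b
}

func main() {
	fmt.Printf("build date: %v\n", buildLabel(build))
}
```

This way a plain go build or go run still produces readable output instead of an empty field.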

Friday, April 9, 2021

Keep Go Module Directory Clean with GOMODCACHE

Go makes downloading projects and their dependencies very easy.  In the beginning there was go get, which downloads the project source code and its dependencies to $GOPATH/src.  With modules, all the dependencies are downloaded to $GOPATH/pkg/mod.  The ease of downloading and the lack of management control in the go command mean that it is easy for the two directories to grow in size and to lose track of which project led to the download of a particular package.

I recently started to play around with the Fyne UI toolkit.  I didn't initially know what other packages it would download so I wanted to have Fyne and its dependencies in their own area.  The go command has a flag -pkgdir that is shared by the various commands.

The build flags are shared by the build, clean, get, install, list, run, and test commands:

...

-pkgdir dir

install and load all packages from dir instead of the usual locations. For example, when building with a non-standard configuration, use -pkgdir to keep generated packages in a separate location.

This didn't work as I expected because it didn't seem like it did anything at all.  Using the command

go build -pkgdir /tmp

resulted in all the downloaded packages still going to $GOPATH/pkg/mod.

What did work (thanks to seankhliao) is to set the GOMODCACHE variable, which sets not only the cache location but also where the downloaded packages are stored:

GOMODCACHE=/tmp go build

All the dependency packages will now be downloaded to /tmp rather than $GOPATH/pkg/mod.
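
To make the lookup order concrete, here is a small sketch of how the location is chosen: GOMODCACHE when it is set, otherwise pkg/mod under the first GOPATH entry.  The function name modCacheDir is made up for illustration, and the program only reads the process environment, so it will not see values stored with go env -w (running go env GOMODCACHE remains the authoritative answer):

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// modCacheDir mirrors the default described above: GOMODCACHE if set,
// otherwise pkg/mod under the first entry of GOPATH.
// It returns "" when neither value is available.
func modCacheDir(gomodcache, gopath string) string {
	if gomodcache != "" {
		return gomodcache
	}
	dirs := filepath.SplitList(gopath)
	if len(dirs) == 0 {
		return ""
	}
	return filepath.Join(dirs[0], "pkg", "mod")
}

func main() {
	// Only the process environment is consulted here.
	fmt.Println(modCacheDir(os.Getenv("GOMODCACHE"), os.Getenv("GOPATH")))
}
```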

Honestly, I'm not really sure what -pkgdir is supposed to do.  Maybe it is only for things that the build command generates?  What does it do when used with go get?

Wednesday, April 7, 2021

Local Go module file and Go tools magic

I really value that when working with Go there is no "hidden magic" in source code.  Go source code is essentially WYSIWYG.  You don't see decorators or dependency injection that might change the behavior after it is compiled or run, which would require you to understand not only the language and syntax but also how additional tools act on the source code.  While this is true of the language, it is not true of the go command for Go's module system.

I've personally found Go modules to be more confusing than the original GOPATH.  I understand that it solves some of the complaints about GOPATH and also addresses the diamond dependency problem, but it also adds complexity to the developer workflow and under-the-hood magic. Maybe that's to be expected when it is going beyond source code management and adding a whole package management layer on top, but I'd be much happier to deal with this added complexity and burden if the solution was complete (how about package clean up so my mod directory isn't growing non-stop?)!

Modules add the go.mod file that tracks all of a project's dependencies and their versions.  This introduces a problem when one is developing both applications and libraries, since it is possible that the developer has both the released production version and the in-development version of a library locally.  To point your application at the local library without constantly changing the import path in source code, the replace directive can be used, but it is not ideal to commit a go.mod with replace directives in it as it will likely break the build for someone else checking out the code and can expose private data (the local path might contain the user name).

Now developers have to add the replace directives locally, remove them right before submission and then put them back (without typos!).  Fortunately, in Go 1.14, the go commands (build, clean, get, install, list, run, and test) got a new flag '-modfile' which allows developers to tell it to use an alternative go.mod file.  This allows the production version of go.mod to stay unmodified during development/debugging, while a local dev version of go.mod can be excluded from getting committed (i.e. .gitignored).

This can be done on a per-project level by adding -modfile=go.local.mod to go [build | clean | get | install | list | run | test]:

go build -modfile=go.local.mod main.go

Note that whatever the file name is, it still has to end in .mod since the tool creates a matching local go.sum file whose name is the same as the local mod file except with the extension changed from .mod to .sum.
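
The naming rule can be sketched in a few lines of Go.  This is only an illustration of the rule as described above, not the go command's actual implementation, and sumFileFor is a hypothetical helper name:

```go
package main

import (
	"fmt"
	"strings"
)

// sumFileFor is a hypothetical helper that applies the naming rule:
// the file must end in .mod, and the sum file swaps .mod for .sum.
func sumFileFor(modfile string) (string, error) {
	if !strings.HasSuffix(modfile, ".mod") {
		return "", fmt.Errorf("%q does not end in .mod", modfile)
	}
	return strings.TrimSuffix(modfile, ".mod") + ".sum", nil
}

func main() {
	for _, name := range []string{"go.mod", "go.local.mod", "local.txt"} {
		sum, err := sumFileFor(name)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> %s\n", name, sum)
	}
}
```

So with -modfile=go.local.mod, both go.local.mod and go.local.sum are candidates for .gitignore.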

To apply the use of go.local.mod globally, update "go env":

go env -w GOFLAGS=-modfile=go.local.mod

go env -w will write the -modfile value to where Go looks for its settings:

Defaults changed using 'go env -w' are recorded in a Go environment configuration file stored in the per-user configuration directory, as reported by os.UserConfigDir.

So the flow that Jay Conrad pointed out in this bug thread would be as follows:

  1. Copy go.mod to go.local.mod. 
  2. Add go.local.mod to .gitignore.
  3. Run go env -w GOFLAGS=-modfile=go.local.mod. This tells the go command to use that file by default.
  4. Add any replace and exclude directives or other local edits.
  5. Before submitting and in CI, make sure to test without the local file: go env -u GOFLAGS or just -modfile=. 
  6. Probably also go mod tidy.

Tuesday, April 6, 2021

Listing installed packages on Fedora with DNF

To list the packages that are user installed:

dnf history userinstalled

To list all installed packages:

dnf list installed

Friday, January 1, 2021

2021 PC - Asus PN50 4800U

Although I was very tempted to build a new desktop PC and get access to all the power goodness of the latest AMD Ryzen, I was hesitant giving up the small form factor that I had with my Shuttle PC DS87.  When the Asus PN50 with the AMD Ryzen 4800U became available I took the plunge.

The specs comparison between the previous and new PCs:

New PC:

  • Ryzen 7 4800U [Zen2] (8 cores / 16 threads, base clock 1.8GHz, max 4.2GHz - 8 GPU cores - RX Vega 8, 15W)
  • 32 GB Crucial DDR4 3200Mhz  RAM (2x16GB)
  • 1TB Samsung 970 EVO Plus (M.2 NVMe interface) SSD
  • 500GB Crucial MX500 SATA SSD (2.5")
  • Intel WIFI 6, BT 5.0

Previous PC:

  • Shuttle PC DS87
  • Intel Core i7-4790S Processor (8M Cache, base clock 3.2 GHz, max 4.0GHz, 65W)
  • Samsung 850 EVO 500GB 2.5-Inch SATA III Internal SSD (MZ-75E500B/AM)
  • 2 x Crucial 16GB Kit (8GBx2) DDR3 1600 MT/s (PC3-12800)

There are enough sites giving benchmarks so I'm not going to try to repeat what they've done, but I wanted to have something to show myself a tangible performance improvement.  It is generally during compilation when I wish things would go faster so why not compare compilation between the two systems?  The extra cores (8 vs 4) and threads (16 vs 8) should benefit compilation even if the base clock of the 4800U is 1.8GHz while the i7's is 3.2GHz.  I'm also expecting a modern CPU to be more efficient per clock cycle than a 6-year-old CPU.

I decided to time the compilation of OpenCV using the following

wget -O opencv.zip https://github.com/opencv/opencv/archive/master.zip
unzip opencv.zip 
mkdir -p build && cd build
cmake ../opencv-master/
time cmake --build .

i7 Results

real   28m57.219s
user   26m48.466s
sys     2m01.402s

4800U Results

real     36m48.166s
user     34m54.722s
sys       1m52.574s

How did this happen?  Was the 3.2-4.0 GHz too much for the 1.8-4.2 GHz to overcome?  It did seem like during compilation all of the i7 cores were running at around 3.6 GHz, but I suspected that the build was not actually taking advantage of all the cores of the 4800U.

I tried again using Ninja which automatically configures the build to use the multi-core CPUs.

make clean
cmake -GNinja ../opencv-master/
time ninja

i7 Results

real	11m28.741s
user	85m39.188s
sys	 3m23.310s

4800U Results

real      6m39.268s
user     99m03.178s
sys       4m8.597s

This result looks more like what I expected.  More total CPU time (user and sys) was used on both the i7 and 4800U as more cores and threads were utilized, but the real (wall-clock) time was much shorter.  This just shows that for a lot of consumers fewer cores but faster clock speeds might be better for desktops (laptops and battery life add another dimension) as they rely on the applications to be programmed to take advantage of the multiple cores.  That's why gaming systems usually will give up more cores for faster clock speeds since games aren't known for utilizing multiple cores.
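
To put numbers on the difference, the following sketch computes the wall-clock speedups from the "real" times reported above.  The helper names speedup and mustSeconds are only for illustration:

```go
package main

import (
	"fmt"
	"time"
)

// speedup returns how many times faster the "after" run was.
func speedup(beforeSec, afterSec float64) float64 {
	return beforeSec / afterSec
}

// mustSeconds parses a duration such as "28m57.219s" into seconds
// and panics on malformed input, which is fine for fixed constants.
func mustSeconds(s string) float64 {
	d, err := time.ParseDuration(s)
	if err != nil {
		panic(err)
	}
	return d.Seconds()
}

func main() {
	// Wall-clock ("real") times copied from the results above.
	i7Make, i7Ninja := mustSeconds("28m57.219s"), mustSeconds("11m28.741s")
	amdMake, amdNinja := mustSeconds("36m48.166s"), mustSeconds("6m39.268s")

	fmt.Printf("i7 speedup with Ninja:    %.2fx\n", speedup(i7Make, i7Ninja))
	fmt.Printf("4800U speedup with Ninja: %.2fx\n", speedup(amdMake, amdNinja))
	fmt.Printf("4800U vs i7 under Ninja:  %.2fx\n", speedup(i7Ninja, amdNinja))
}
```

By these numbers the 4800U gains roughly 5.5x from the parallel build versus about 2.5x for the i7, and ends up about 1.7x faster than the i7 once all cores are in use.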

Friday, November 27, 2020

Building GUI applications with Go (Golang)

Go is my favorite programming language.  I have mostly used it for writing command line programs or server-side services so I was not familiar with using it for writing GUI desktop applications.  Questions about using Go for writing GUI applications come up periodically on Reddit or Hacker News with some saying that Go is not appropriate for GUIs while others argue the opposite.  

I decided to take an existing command line program that I've written and put a graphical interface on it using different GUI frameworks/tool kits that are available.  The program's purpose is very simple.  It checks for the latest stable version of Go for your system and if there is a newer version then it downloads it.  Once the file is downloaded, it verifies the checksum to make sure that it correctly downloaded a good file.  The UI will simply show the information (where to save, the version to download, the checksum) and a button to start downloading.  During download, a progress bar indicates what is happening. 
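
The checksum step is the most mechanical part of the program.  As a rough sketch (this is not the code from the getgo repository, and it assumes the published checksum is a hex-encoded SHA-256 digest), verification could look like the following, where verifySHA256 and verifyBytes are hypothetical names:

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"strings"
)

// verifySHA256 streams r through SHA-256 and compares the digest
// with the expected hex string (case-insensitive).
func verifySHA256(r io.Reader, wantHex string) (bool, error) {
	h := sha256.New()
	if _, err := io.Copy(h, r); err != nil {
		return false, err
	}
	got := hex.EncodeToString(h.Sum(nil))
	return strings.EqualFold(got, strings.TrimSpace(wantHex)), nil
}

// verifyBytes is a convenience wrapper for data already in memory.
func verifyBytes(data []byte, wantHex string) bool {
	ok, err := verifySHA256(bytes.NewReader(data), wantHex)
	return err == nil && ok
}

func main() {
	// SHA-256 of the ASCII string "hello".
	const want = "2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824"
	fmt.Println("checksum ok:", verifyBytes([]byte("hello"), want))
}
```

Taking an io.Reader means a downloaded file can be passed in directly after os.Open without loading the whole archive into memory.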




The code can be found at https://github.com/lazyhacker/getgo with the GUI files in the internal/gui package.  gtk.go and fyne.go are for GTK and Fyne respectively.

TL;DR

Go is perfectly capable of writing GUI applications as far as functionality goes.  There are tried-and-true toolkits such as GTK and Qt and emerging ones such as Fyne and Gio.  The former are more polished but come with an extra layer of non-Go code between the app and the graphics layer and a steeper learning curve.  The latter's tooling, visuals, documentation and functionality are less developed.

Where all of the options fall behind some other languages is in its developer friendliness in the form of documentation.  

GTK does have a lot of documentation and many users who have posted answers on forums, but the documentation is based on another programming language.

Gio and Fyne are more lacking in documentation and tutorials so they can be more frustrating for beginners looking to learn or find answers, although the general concepts might be more easily understood since they aren't as big a system as GTK.

All the toolkits I tried rely a lot on providing code examples as a form of teaching, but the examples aren't very well documented or discoverable, making them less friendly to developers.

Binary sizes for my simple program are:
  • command-line only ~6M
  • command-line + GTK GUI ~9M
  • command-line + Fyne GUI ~14M
GTK does require the shared libraries to be available on the system (they aren't compiled into the above binary) so they will need to be bundled with the app.

It took me about a day to write the Fyne version and about 2 days for the GTK version.  I also had to spend a day to figure out how to get both to compile and run on Windows.

Choices

There are actually many choices available to Go developers for building GUIs.  I think the perception that Go isn't a "GUI" language could be because:
  1. A number of GUI projects have been abandoned.  
  2. There is no single "blessed" GUI framework from the Go team.  
  3. There is no fully native Go implementation of a GUI toolkit.  
(1) Given the complexity of developing a GUI framework, I don't think it is unusual that there are a lot of abandoned GUI projects.  What matters more is what is available and actively maintained; there are always many abandoned projects in any language, so don't let the noise give you the wrong impression.

(2) Although there are some languages that come with a GUI toolkit as part of the language (e.g. Java, Swift), most languages don't.  Go can feel like it's a "batteries included" language with its rich standard library (e.g. it basically comes with an HTTP server) so having no GUI option could lead to the misconception that the Go team doesn't believe that Go should be used for GUI apps.  However, a lot of languages don't have one either (C/C++, Python, etc.)

(3) This really depends on the definition of a fully native Go implementation:  
  • Can app developers write everything in Go?  Can it be written in an idiomatic way?
  • Can the entire project just depend on the Go tool chain?
  • Is the whole tech stack built with Go?
With the exception of the tool kits whose philosophy is to combine Go for backend logic and another language for the front-end GUI (e.g. JavaScript), most of the options let the developer write everything in Go.  The Go tool chain includes cgo for interfacing with C code so even if the toolkit is dependent on C libraries (e.g. OpenGL, GTK/Qt, etc.) the app developer doesn't really have to deal with other tool chains, except maybe having some installed for cgo to access.

Having the whole stack be written in Go is not realistic.  With the exception of C, every language at some point has to deal with the lower level of the system (whether it be the graphics subsystem or OS) that is written in another language.

Personally, as long as everything I write is in Go and only the Go tools are used then that is "native" enough for me.  As an application-layer developer, I don't expect to be working on the GUI underpinnings that would require combining languages and tool chains.  That means there are plenty of options for Gophers (developers who use Go), and my evaluation criteria are some subjective qualities, such as how intuitive the toolkit is and whether it is easy to learn/use, and some quantitative attributes such as stability.

With this in mind, the most often mentioned options are Qt, GTK, Fyne and Gio.  Between Qt and GTK, I chose GTK.  Both of these are popular production-level GUI toolkits written in C/C++.  The reason I picked GTK is that I use Linux and Gnome, plus my admittedly subjective perception that installing GTK is easier than installing a Qt dev environment.

I also wanted to try either Fyne or Gio as these are two toolkits that were built with Go.  Both rely on go-gl (which in turn depends on OpenGL) to deal with the graphics subsystem to draw the interface and widgets, so they don't have that extra layer between the app and the graphics system being occupied by another framework like GTK or Qt.


Gio (https://gioui.org)

The first GUI toolkit that I tried to look at is Gio.  It wholeheartedly embraced everything Go and the latest-and-greatest.  It allows compiling to desktops, mobile, and WASM (web assembly).   It supports Go modules (common) and drops GOPATH support (uncommon).   Its embrace of immediate mode GUI programming is kind of Go-like in its belief that it's not always necessary to give up control to a framework.

Getting Gio installed and ready to use is very easy.  Make sure the system already has the Wayland/X11 and OpenGL development libraries installed.  A simple 1-line install from dnf, apt or whatever Linux package manager you use will likely suffice.  Then it is simply a matter of importing the package in your Go code and, during the first compile, Go will pull down Gio and all its dependencies through Go modules.

However, I quickly moved away from Gio for a couple of reasons:
  1. It lacked documentation to help a new user understand how to use it and I found the existing documentation to be poorly organized.  It primarily relies on code examples and API comments and then leaves it to the users to figure out for themselves how to use it.
  2. While immediate mode gives more control to the developers, building GUIs is one area where there's enough complexity that I don't necessarily mind handing it off to a toolkit to take care of things.  I wonder whether immediate mode is actually more useful to developers who build GUI tool kits than to developers who use the tool kits.
I might come back to Gio some day when I have more time to learn it.


Fyne (https://fyne.io)

Fyne is a more standard retained-mode GUI tool kit that is easy to install.  It is similar to Gio in that you just need to use your package manager to install the graphical development libraries and then just import Fyne to have it download all the necessary packages and dependencies.  Unlike Gio, it also supports the traditional GOPATH method if you don't use Go modules.

The documentation was much more comprehensive, featuring a quick beginner walk-through, tutorials and API documentation.  What holds Fyne back a little is the organization of the documentation, which required me to do quite a bit of jumping between sections to understand something.  For example, one section talks about a widget but it is somewhere else where it shows what the widget actually looks like.

I was able to put together a basic interface pretty quickly with Fyne thanks to its basic tutorials.  While getting something working quickly is a plus, I'll admit that I did not find the graphical elements very attractive.  It seems to be embracing material design in some form but feels like it's incomplete.  Extending the look and feel is also difficult at the moment. 

Although I mentioned that I got a working GUI up quickly, I was immediately met with a bug.  The app would start and all the components would draw in the window before it would suddenly turn blank.  Resizing or hovering over a particular widget would bring the interface back.  I only saw this on my Linux system.  I reported this to Fyne and got a response pretty quickly asking me for more info and some follow up questions.  It's good that the Fyne developers are keeping an eye on bug submissions!


Microsoft Windows 

Compiling and running on Windows basically requires installing a Windows version of gcc.  Fyne's install instructions give 3 options (msys2 + MingW-64, TDM, and Cygwin) for GCC.  I went with msys2 + mingw64 as msys2 is also the recommended way to install GTK.

Once msys2 was installed, I installed the mingw64 gcc package through the msys2 shell:

> pacman -S mingw-w64-x86_64-toolchain base-devel

Note: If you want to use GCC outside of the MingW shell (e.g. with cmd.exe), you need to add:
  • C:\msys2\mingw64\bin
  • C:\msys2\usr\bin
to the Windows PATH variable.

After getting MingW GCC installed, I tried to compile and promptly ran into a compilation error about undefined references.  After much digging, I found the solution to be deleting the go-build cache that is in %USERPROFILE%/AppData/Local/go-build.


Gotk3 (https://github.com/gotk3/gotk3)

Using GTK 3 with Go requires gotk3, which provides Go bindings to GTK's APIs (including glib, gdk, and others).  There is also go-gtk, which provides GTK 2 bindings.

gotk3 has installation instructions on its wiki.  Installing on Linux and MacOS is very simple, similar to Gio and Fyne.  For Windows, as with Fyne, it requires installing msys2 + mingw-64.

Although GTK has extensive documentation and tutorials, it's mostly for C or Python.  gotk3's documentation, unfortunately, consists mainly of comments that are just the C function name.  It doesn't even provide a link to the C documentation so you're required to find it yourself.  For the few APIs that have their own gotk3 comments, they dropped the C function name so you will have to figure out what the Go function maps to.

gotk3 also follows the "read the code" school of teaching.  They have a directory of example code but little explanation of what each one is an example of.  The user is left to decipher all the examples to form an idea of how gotk3 works and what is available.   What I ended up doing was to go through and learn GTK in C first and then try to map the concepts to the gotk3 APIs.    This isn't the most friendly way to introduce a Gopher to GUI development with gotk3 but might be okay for a C GTK developer coming to Go.

Of course, on my very first compilation, I get an error message:

go: finding module for package github.com/gotk3/gotk3/gtk
go: found github.com/gotk3/gotk3/gtk in github.com/gotk3/gotk3 v0.5.0
# github.com/gotk3/gotk3/gtk
../../gopath/pkg/mod/github.com/gotk3/gotk3@v0.5.0/gtk/gtk.go:5369:14: _Ctype_struct__GIcon can't be allocated in Go; it is incomplete (or unallocatable)

Fortunately, there was already a bug filed and a solution was already committed.  I had to force Go to use a newer version of gotk3 than what has been tagged as the stable version:

**Update for go 1.16 4/3/2021**  Another bug came up when trying to use go 1.16 and it looks like a fix made it to the master branch so the same workaround for the previous bug will now also work.

In the project directory:

go get github.com/gotk3/gotk3/gtk@master 




Microsoft Windows

While getting GTK installed on Linux was trivial, it takes a bit more effort on Windows, but the GTK page is very clear.  If you'll be doing everything within msys then there's not much more to do.  If you want to build in another terminal such as cmd, then you'll need to add the mingw bin path to the %PATH% variable.

There seems to be a compatibility issue between gotk3 and GTK that requires removing the -Wl flag from gdk.3.0.pc, which is captured in the gotk3 wiki for installing on Windows.

Saturday, March 14, 2020

Google App Engine's Missed Opportunity

I've been a fan of Google's App Engine (GAE) since its initial release in 2008 but it has never quite taken off despite the growth of running applications in the cloud and the rise of open source software.  It's really a missed opportunity for Google.

I have been running many small projects on GAE which is now part of Google Cloud's offerings.  GAE is friendlier to start with than other hosting options from Google in that it has a free tier which I suspect is sufficient for most users.  GAE auto-scales as traffic increases so there is a possibility that it could surpass the free quota but users can set a guidance on the max daily spend.  This has generally worked for me as I set the max to be $0.00 so that I don't go past the free quota.  Be aware that this is not a hard limit so there is a chance that it can go over the limit.  Recently, I got billed $0.01 requiring me to log in to Google Cloud and pay the amount due.  Since I had to log into the developer console, it gave me a chance to look at the projects that I've been running.  The majority were simple static websites and, as simple as GAE is to use, it's easier to use something like Github Pages for those.  Both offer SSL (HTTPS support) and custom domains so I decided to move my sites off of GAE.

This move got me thinking about the missed opportunity for Google with GAE.  It is not because GAE should be a static web hosting site since GAE is about running applications hosted in the cloud.  GAE offers a simple and complete solution that was perfect for users of open source projects. 

Just as Github Pages is a super simple solution to host static web pages, GAE started as a super simple solution for running cloud applications.  GAE is basically a server, database, memory cache, sign-in and storage solution all-in-one.  Users don't have to select and install each of these basic components themselves.  This meant that an open source project could be developed where the user can easily run it by putting it on GAE with the same simplicity as desktop projects (possibly even easier).  I imagined a world where someone can write a note taking app for App Engine and anyone who wants to use it can get the source, put it on GAE and have it running and ready to use!   We see note taking programs all the time running on desktops and mobile because the author knows that if the user installs the binary they can start using the app, but for cloud apps it always involves a lot of infrastructure setup.  The reaction to this has been Docker containers which I find are still harder on the user and a lot more complex for the developer.

When GAE was first launched it confused developers who weren't used to this paradigm for web development and Google didn't do a very good job explaining or addressing some missing/problematic areas.  It seems like Google focused more on getting enterprises to switch to this "Platform-As-A-Service" model when they have less need for such hand-holding.  I believe the missed opportunity is that Google didn't see that this was better suited to the consumer market than the enterprise market.

Sunday, June 10, 2018

Go with WebAssembly Early Examples

Update:  I've since written an updated article about Go and WASM since a lot has changed in the time since this article was posted.

WebAssembly (WASM) is the most exciting technology in web development in a long time from my perspective.  It represents the first true steps in breaking the monopoly of Javascript as the language for the browser even though the WASM folks emphasize that isn't the intention.  In my previous post, I mentioned that Go will support compiling to WebAssembly in version 1.11.  This post shares some of my experimentation with Go's WASM support to see what might be possible.  Let me emphasize that the code is very hacky.  I wrote it quickly mainly so I can try out WASM.


There are multiple ways to use WASM for web applications:

  1. Optimization for JavaScript-driven applications - This is the scenario most emphasized for the current level functionality provided by the initial (MVP) release of WASM.  Instead of writing everything in pre-compiled Javascript, optimized WASM modules are called by JavaScript.  These modules can be written in JavaScript or other languages that can compile to WASM.
  2. Porting existing code bases over to the browser - This is most often demonstrated with porting existing C/C++ code bases over to WASM through Emscripten and running what might previously have been an exclusively desktop/native app (e.g. Autocad, games, etc.).
  3. Writing a web app completely in a non-Javascript language - This is what I'm most excited about!  Instead of having Javascript be the primary language, the application is completely written in another language such as Go.  At the time of this writing, this isn't completely possible since WASM modules can't directly call the browser APIs (e.g. DOM APIs, XHR, etc.).  This will be coming and is currently covered under the Host Binding proposal for WASM.  Until then, the Javascript can be abstracted away in a language-specific binding.
The current state of Go WASM doesn't fit the first scenario very well.  There doesn't seem to be a way to expose methods individually for Javascript to call directly.  The Javascript code can execute each Go program's main function and when main() is done the module is done running.  Given that each WASM module has the complete Go runtime including garbage collection, I'm not sure how practical it is to have a bunch of Go programs that are essentially meant to be single methods to the Javascript code.

Go WASM does provide the ability to define callback methods and attach callbacks to browser events.  This requires that the module is running because once it exits it's no longer available for the browser to call.  Because of this, Go WASM is currently most appropriate for scenarios 2 & 3.

The browser executes Javascript and WASM in a single threaded manner.  When the WASM module is running, the browser's front end will block until it is completed.  This will appear to the end user as if the browser is frozen: no way to type input, click a button, etc.  With Go WASM, control is given back to the browser when using time.Sleep or when the Go code is blocked.  This has to be done manually by the developer so it's important to keep it in mind.  Unless the app expects to have complete control of everything in the browser while it is running, developers need to remember to have it periodically sleep to give some control back to the browser.   I hope this gets improved in the future because it makes the code ugly, error prone and honestly isn't something developers should need to do in a multi-tasking environment.
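
The "periodically sleep" idea can be sketched in plain Go without any browser APIs.  In the sketch below, processInChunks is a hypothetical helper that does a long-running loop in chunks and calls a yield function between chunks; in a WASM build the yield function would be a short time.Sleep so the browser gets a chance to handle input and repaint:

```go
package main

import (
	"fmt"
	"time"
)

// processInChunks runs work for items 0..total-1 and calls yield
// after every chunk items so other tasks get a chance to run.
func processInChunks(total, chunk int, work func(i int), yield func()) {
	if chunk < 1 {
		chunk = 1 // guard against a zero or negative chunk size
	}
	for i := 0; i < total; i++ {
		work(i)
		if (i+1)%chunk == 0 {
			yield()
		}
	}
}

func main() {
	sum := 0
	processInChunks(1000, 100,
		func(i int) { sum += i },
		// A short sleep stands in for giving control back to the browser.
		func() { time.Sleep(time.Millisecond) },
	)
	fmt.Println("sum:", sum)
}
```

The chunk size is a trade-off: smaller chunks keep the page more responsive but spend more time sleeping.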

Example 1

Complete source is here. See it running here.  

This is a very basic example of a Go WASM module.  It defines a method that gets attached to an HTML button's click event, changes some attributes of the document's elements, waits a few seconds, and then starts a goroutine that draws to the canvas in the browser.  The keepalive() function prevents the module from quitting until it gets a quit signal, which happens when the Quit button is clicked and the click event invokes the cbQuit method.  Also note that when the browser's Alert dialog is triggered, it blocks everything until it is dismissed, whether it's called by JavaScript or by the Go code.

keepalive() can also simply be
select {}
if the intention is just to keep the module from exiting.
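For readers who don't open the linked source, here is a minimal sketch of the quit-signal version of keepalive().  The names keepalive and cbQuit come from the description above, but the module struct and the simulated click are assumptions of mine so that the sketch runs outside a browser; the real example wires cbQuit to the Quit button instead.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// module holds the quit signal. This struct is a hypothetical stand-in,
// not the layout used in the linked example.
type module struct {
	quit chan struct{}
	once sync.Once
}

func newModule() *module {
	return &module{quit: make(chan struct{})}
}

// cbQuit is what the Quit button's click callback would call.
// sync.Once makes repeated clicks harmless.
func (m *module) cbQuit() {
	m.once.Do(func() { close(m.quit) })
}

// keepalive blocks until cbQuit fires, which keeps main from returning.
func (m *module) keepalive() {
	<-m.quit
}

func main() {
	m := newModule()

	// Simulate the user clicking Quit after a short delay.
	go func() {
		time.Sleep(50 * time.Millisecond)
		m.cbQuit()
	}()

	m.keepalive()
	fmt.Println("module exiting")
}
```

Blocking on a channel rather than on select {} is what makes a clean exit possible: the module stops when asked instead of running until the page is closed.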

The basics of having Go drive the web page (rather than JavaScript) are all there.

Example 2

Complete source here.  Running example here.

This example was inspired by what was presented at GopherCon by Hana Kim when gomobile was first announced.


Instead of rewriting Rob Pike's Ivy interpreter for Android, iOS, and the command line, her example reused the existing Go library.  In the same way, this example shows that existing Go libraries can be used in WASM.


The resulting WASM module is about 4 MB, but when compressed the whole web app is only about 1 MB.

This example really doesn't show anything that the previous example didn't already show. I wanted to try against a larger code base and didn't want to write it myself.  It mainly just demonstrates an existing Go library being used with no changes needed to make it work.

The Ivy code is easy to port, but if a library expects to access a file system or make HTTP calls then the browser environment doesn't match.  There isn't any standard file system, and outbound calls use XMLHttpRequest rather than the current HTTP package.

Conclusions


It is possible to use Go to write web applications even though it currently still needs to rely on JavaScript to access the browser's APIs.  Things like goroutines work, but developers have to handle sleeping themselves in order to not block the UI.  However, the possibility of ending JavaScript's monopoly as the browser's language is definitely there!

Browsers themselves still need to do some optimizations.  Each WASM module's byte code still has to be compiled, so there's always a delay before the module starts running.  It still starts faster than JavaScript, but it'd be much better if browsers cached the compiled WASM the first time.  My understanding is that this will likely come in the near future, but it's not there yet.

I'm very excited to see this land in Go 1.11 and look forward to seeing what other developers use it for!

Sunday, May 20, 2018

WebAssembly (WASM) with Go

This is the first time that I've tried to compile the Go compiler so I figure that I'll document it here for others who might be wanting to try WASM with Go before the official Go release in August.

The next release of Go (1.11) is scheduled for August, and it will be the first release to support WebAssembly (WASM).  WASM will be a build target, much like building a binary for another machine architecture or operating system, except that in this case the machine is a "virtual" machine that runs inside the browser.
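To make the "build target" idea concrete, here is a small sketch that reports which target a program was compiled for.  The target helper is just an illustration.  Built normally it prints the host's OS and architecture; built with GOOS=js GOARCH=wasm it should report js/wasm.

```go
package main

import (
	"fmt"
	"runtime"
)

// target reports the OS/architecture pair this binary was compiled for.
func target() string {
	return runtime.GOOS + "/" + runtime.GOARCH
}

func main() {
	fmt.Println("built for", target())
}
```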

I've been very excited about WASM and have been eagerly waiting for the time when I can use Go to write it.  The branch for the 1.11 release is now locked for bug fixes only, so I figured that it might be a good time to try writing some WASM code with Go.

Note that this first release for WASM isn't going to be "production" ready.  I don't think it'll be very optimized, and WASM itself is only at its MVP release.  It doesn't allow WASM code direct access to the DOM or the browser APIs, so all calls still go through JavaScript.

Unfortunately, as of this writing, the code in the Go repo doesn't work for WASM.  The work on WASM for Go is primarily being done by Richard Musiol (neelance), the author of GopherJS, so I decided to get the version of Go from this GitHub repo.

First, you will need a version of Go on your system to compile the source.  The easiest way is to download and install the binary distribution from the main Go website.

Until 1.11 is released, the only way to get WASM support is to build Go from source.

git clone https://go.googlesource.com/go
cd go

Once you have the source, it's time to compile it.

cd src
./all.bash
If everything is built correctly then there will be a new "go" binary in ../bin.  Also, there are two files to help you get started with loading future WASM files in ../misc/wasm.

Create a test.go file:

package main

func main() {
    println("hello, wasm")
}

Compile to WASM with:

GOOS=js GOARCH=wasm go build -o test.wasm test.go

Make sure you're running the go command that you've just built and not the version that you installed originally.

In the same directory with your test.wasm file, copy over wasm_exec.html and wasm_exec.js from misc/wasm.

The browser will expect the test.wasm file to be served with a MIME type of application/wasm, so you'll need to add the following to your /etc/mime.types file:

application/wasm wasm

To run it in your browser, you'll need a web server.  Creating one with Go is easy, or you can use Python 2 with the following command (on Python 3 the equivalent is python3 -m http.server).

python -m SimpleHTTPServer
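
If you'd rather go the Go route, a minimal sketch of such a server could look like the following.  The registerWasmType helper, the command-line arguments, and the localhost:8080 default are my own assumptions, not something shipped with Go.  Since http.FileServer picks the Content-Type from the file extension using the mime package, registering .wasm in the program should make the /etc/mime.types edit unnecessary for this particular server.

```go
package main

import (
	"fmt"
	"log"
	"mime"
	"net/http"
	"os"
)

// registerWasmType makes sure .wasm files are served as application/wasm
// and returns the type Go now reports for that extension.
// Hypothetical helper name, chosen for this sketch.
func registerWasmType() (string, error) {
	if err := mime.AddExtensionType(".wasm", "application/wasm"); err != nil {
		return "", err
	}
	return mime.TypeByExtension(".wasm"), nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: serve <directory> [address]")
		return
	}
	dir := os.Args[1]
	addr := "localhost:8080" // assumed default; pass another as the 2nd argument
	if len(os.Args) > 2 {
		addr = os.Args[2]
	}

	if _, err := registerWasmType(); err != nil {
		log.Fatal(err)
	}

	log.Printf("serving %s on http://%s", dir, addr)
	// http.FileServer sets Content-Type from the file extension.
	log.Fatal(http.ListenAndServe(addr, http.FileServer(http.Dir(dir))))
}
```

Run it with the directory that holds test.wasm, wasm_exec.html and wasm_exec.js as the first argument.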

Point your browser to your web server and load wasm_exec.html.  Click on the button and you should see "hello, wasm" appear in the browser's console.

I've been experimenting with WASM here.

Sunday, July 16, 2017

My First GopherCon


Being a member of the Go team allowed me to attend my first GopherCon in Denver from July 13 to July 15. This is the 4th GopherCon to be held, and previously I’ve only watched the recordings of the sessions on the Gopher Academy YouTube channel.